Abstract

This paper develops a category-theoretic approach to uncertainty,
informativeness, and decision-making problems. It is based on an appropriate
first-order fuzzy logic in which not only the logical connectives but also
the quantifiers have a fuzzy interpretation. It is shown that all fundamental
concepts of probability and statistics, such as joint distribution,
conditional distribution, etc., have meaningful analogs in the new context.
This approach makes it possible to utilize the rich conceptual experience of
statistics. The connection with the underlying fuzzy logic reveals the logical
semantics of fuzzy decision making. Decision-making problems are considered
within the framework of IT-categories, generalizing the Bayesian approach to
decision making with prior information. This leads to a fuzzy Bayesian
approach to decision making and provides methods for constructing optimal
strategies.

1 Introduction



The theory of fuzzy sets is used more and more widely in the description of 
uncertainty. Indeed, very often some poorly formalizable notions or expert 
knowledge are readily expressed in terms of fuzzy sets. In particular, fuzzy 
sets are extremely convenient in descriptions of linguistic uncertainties [1]. 
On the other hand, fuzzy notions themselves often admit flexible linguistic 
interpretations. This makes the exploitation of fuzzy sets especially natural 
and illustrative. 

It is well known that fuzzy set theory suggests a wide range of specific ap- 
proaches to decision problems in a fuzzy environment [2,3]. However, it still 
cannot easily compete with probability theory in dealing with uncertainty. 
Certainly, mathematical statistics (more precisely, statistical decision the- 
ory) has accumulated a rich conceptual experience. 

It was shown in [4, 5] that the basic constructions and propositions of 
probability theory and statistics playing the fundamental role in decision- 
making problems have meaningful counterparts in fuzzy theories. It makes 
it possible to use the methodology of statistical decision-making in the fuzzy 
context. In particular, the fuzzy variant of the Bayes principle derived in 
this paper plays the same role in fuzzy decision-making problems as its 
probabilistic prototype in the theory of statistical games [7].

In contrast to the approach of [4], where all the notions were introduced
"operationally" in order to be closer to the similar notions in statistics,
in this paper all the notions are introduced "logically," i.e., by the
corresponding formulas in an appropriate first-order fuzzy logic. In this
logic not only the logical connectives but also the quantifiers have a fuzzy
interpretation.

The connection with the underlying fuzzy logic provides an interesting
logical semantics of fuzzy decision making. Indeed, in this approach a priori
information may be represented by a fuzzy predicate, and an experiment by a
certain fuzzy relation. The loss function is replaced by the fuzzy relation
"good decision for," the mathematical expectation operator by the fuzzy
universal quantifier, etc. As a result, the notion "good decision strategy"
is expressed by a first-order formula in this logic.

It is convenient to consider the different systems involved in acquiring and
processing information as particular cases of so-called information
transformers (ITs). Besides, it is useful to work with families of ITs in
which certain operations, e.g., sequential and parallel compositions, are
defined.

It was noticed fairly long ago [9-13] that the adequate algebraic structure
for describing information transformers (initially for the study of
statistical experiments) is that of a category [14-17].

Analysis of general properties of the classes of linear, multivalued, and
fuzzy information transformers studied in [4,13,18-23] made it possible to
extract general features shared by all these classes. Namely, each of these
classes can be considered as a family of morphisms in an appropriate category,
where the composition of information transformers corresponds to their
"consecutive application." Each category of ITs (or IT-category) contains a
subcategory (of so-called deterministic ITs) that has products. Moreover, the
operation of morphism product is extended in a "coherent way" to the whole
category of ITs.

The works [24, 25] attempted to formulate a set of "elementary" axioms for a
category of ITs that would be sufficient for an abstract expression of the
basic concepts of the theory of information transformers and for the study of
informativeness, decision problems, etc. This paper proposes another,
significantly more compact axiomatics for a category of ITs. According to
this axiomatics, a category of ITs is defined in effect as a monoidal category
[14, 16] containing a subcategory (of deterministic ITs) with finite products.

Among the basic concepts connected with information transformers there is one
that plays an important role in the uniform construction of a wide spectrum
of IT-categories: the concept of distribution. Indeed, fairly often an IT
$a\colon \mathcal{A} \to \mathcal{B}$ can be represented by a mapping from
$\mathcal{A}$ to the "space of distributions" on $\mathcal{B}$ (see, e.g.,
[4,5,21]). For example, a probabilistic transition distribution (an IT in the
category of stochastic ITs) can be represented by a certain measurable mapping
from $\mathcal{A}$ to the space of distributions on $\mathcal{B}$. This
observation suggests constructing a category of ITs as a Kleisli category
[14,26,27] arising from the following components: an obvious category of
deterministic ITs; a functor that takes an object $\mathcal{A}$ to the object
of "distributions" on $\mathcal{A}$; and a natural transformation of functors
describing an "independent product of distributions".
The approach developed in this paper makes it easy to express in terms of
IT-categories such concepts as distribution, joint and conditional
distributions, independence, and others. It is shown that on the basis of
these concepts it is possible to formulate a fairly general statement of the
decision-making problem with prior information, which generalizes the
Bayesian approach in the theory of statistical decisions. Moreover, the
Bayesian principle derived below, like its statistical prototype [28], reduces
the problem of constructing an optimal decision strategy to the significantly
simpler problem of finding an optimal decision for a posterior distribution.

Among the most important concepts in categories of ITs is the concept 
of (relative) informativeness of information transformers. There are two 
different approaches to the concept of informativeness. 

One of these approaches is based on analyzing the "relative positions" 
of information transformers in the corresponding mathematical structure. 
Roughly speaking, one information transformer is regarded as more infor- 
mative than another one if with the aid of an additional information trans- 
former the former one can be "transformed" to an IT, which is similar to 
(or more "accurate" than) the latter one. In fact, this means that all the
information that can be obtained from the latter information transformer 
can be extracted from the former one as well. 

The other approach to informativeness is based on treating informa- 
tion transformers as data sources for decision-making problems. Here, one 
information transformer is said to be semantically more informative than 
another if it provides better quality of decision making. Obviously, the no- 
tion of semantical informativeness depends on the class of decision-making 
problems under consideration. 

In the classical studies of Blackwell [29,30], the correspondence between
informativeness (Blackwell sufficiency) and semantical informativeness
(Blackwell informativeness) was investigated in a statistical context. These
studies were extended by Morse, Sacksteder, and Chentsov [9-12], who applied
category-theoretic techniques to their studies of statistical systems.

Interestingly, under very general conditions the relations of informativeness
and semantical informativeness (with respect to a certain class of
decision-making problems) coincide. Moreover, in some categories of ITs it is
possible to point out one special decision problem such that the resulting
semantical informativeness coincides with informativeness.

Analysis of classes of equivalent (with respect to informativeness) in- 
formation transformers shows that they form a partially ordered Abelian 
monoid with the smallest (also neutral) and the largest elements. 

One of the objectives of this paper is to show that the basic constructions 
and propositions of probability theory and statistics playing the fundamen- 
tal role in decision-making problems have meaningful counterparts in terms 
of IT-categories. Furthermore, some definitions and propositions (for ex- 
ample, the notion of conditional distribution and the Bayesian principle) 
in terms of IT-categories often have more transparent meanings. This provides
an opportunity to look at well-known results from a different angle. What is
even more significant, it makes it possible to apply the methodology of
statistical decision making in an alternative (not probabilistic) context.

The approaches proposed in this work may provide a basis for the construction
and study of new classes of ITs, in particular, dynamical nondeterministic
ITs, which may provide an adequate description of information flows and
information interactions evolving in time. Besides, a uniform approach to
problems of information transformation may be useful for a better
understanding of the information processes that take place in complex
artificial and natural systems.

2 Categories of information transformers 

2.1 Common structure of classes of information transformers

It is natural to assume that for any information transformer $a$ there is
defined a pair of spaces, $\mathcal{A}$ and $\mathcal{B}$: the space of
"inputs" (or input signals) and the space of "outputs" (results of
measurement, transformation, processing, etc.). We will say that $a$ "acts"
from $\mathcal{A}$ to $\mathcal{B}$ and denote this by
$a\colon \mathcal{A} \to \mathcal{B}$. It is important to note that typically
an information transformer not only transforms signals but also introduces
some "noise". In this case it is nondeterministic and cannot be represented
just by a mapping from $\mathcal{A}$ to $\mathcal{B}$.

It is natural to study information transformers of similar type by
aggregating them into families endowed with a fairly rich algebraic structure
[13,18]. Specifically, it is natural to assume that families of ITs possess
the following properties:

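(a) If $a\colon \mathcal{A} \to \mathcal{B}$ and
$b\colon \mathcal{B} \to \mathcal{C}$ are two ITs, then their composition
$b \circ a\colon \mathcal{A} \to \mathcal{C}$ is defined.

(b) This operation of composition is associative.

(c) There are certain neutral elements in these families, i.e., ITs that
do not introduce any alterations. Namely, for any space $\mathcal{B}$ there
exists a corresponding IT $i_{\mathcal{B}}\colon \mathcal{B} \to \mathcal{B}$
such that $i_{\mathcal{B}} \circ a = a$ and $b \circ i_{\mathcal{B}} = b$.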

Algebraic structures of this type are called categories [14, 16]. 

Furthermore, we will assume that to every pair of information transformers
$a$ and $b$ acting from the same space $\mathcal{D}$ to spaces $\mathcal{A}$
and $\mathcal{B}$, respectively, there corresponds a certain IT $a * b$
(called the product of $a$ and $b$) from $\mathcal{D}$ to
$\mathcal{A} \times \mathcal{B}$. This IT in a certain sense "represents" both
ITs $a$ and $b$ simultaneously. Specifically, the ITs $a$ and $b$ can be
"extracted" from $a * b$ by means of the projections
$\pi_{\mathcal{A},\mathcal{B}}$ and $\nu_{\mathcal{A},\mathcal{B}}$ from
$\mathcal{A} \times \mathcal{B}$ to $\mathcal{A}$ and $\mathcal{B}$,
respectively, i.e., $\pi_{\mathcal{A},\mathcal{B}} \circ (a * b) = a$,
$\nu_{\mathcal{A},\mathcal{B}} \circ (a * b) = b$. Note that typically an IT
$c$ such that $\pi_{\mathcal{A},\mathcal{B}} \circ c = a$,
$\nu_{\mathcal{A},\mathcal{B}} \circ c = b$ is not unique, i.e., a category of
ITs does not have products (in the category-theoretic sense [14-17]). Thus,
the notion of a category of ITs requires careful formalization.

Analysis of the classes of information transformers studied in
[4,5,13,18-23,25] gives grounds to consider these classes as categories that
satisfy certain fairly general conditions.

2.2 Categories: basic concepts 

Recall that a category (see, for example, [14-17]) $\mathbf{C}$ consists of a
class of objects $\mathrm{Ob}(\mathbf{C})$, a class of morphisms (or arrows)
$\mathrm{Ar}(\mathbf{C})$, and a composition operation $\circ$ for morphisms,
such that:

(a) To any morphism $a$ there corresponds a certain pair of objects
$\mathcal{A}$ and $\mathcal{B}$ (the source and the target of $a$), which is
denoted $a\colon \mathcal{A} \to \mathcal{B}$.

(b) For every pair of morphisms $a\colon \mathcal{A} \to \mathcal{B}$ and
$b\colon \mathcal{B} \to \mathcal{C}$ their composition
$b \circ a\colon \mathcal{A} \to \mathcal{C}$ is defined.

Moreover, the following axioms hold: 

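(c) The composition is associative:

$$c \circ (b \circ a) = (c \circ b) \circ a.$$

(d) To every object $\mathcal{R}$ there corresponds an (identity) morphism
$i_{\mathcal{R}}\colon \mathcal{R} \to \mathcal{R}$, so that

$$\forall a\colon \mathcal{A} \to \mathcal{B}, \qquad a \circ i_{\mathcal{A}} = a = i_{\mathcal{B}} \circ a.$$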

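A morphism $a\colon \mathcal{A} \to \mathcal{B}$ is called an isomorphism if
there exists a morphism $b\colon \mathcal{B} \to \mathcal{A}$ such that
$a \circ b = i_{\mathcal{B}}$ and $b \circ a = i_{\mathcal{A}}$. In this case
the objects $\mathcal{A}$ and $\mathcal{B}$ are called isomorphic.

Morphisms $a\colon \mathcal{D} \to \mathcal{A}$ and
$b\colon \mathcal{D} \to \mathcal{B}$ are called isomorphic if there exists
an isomorphism $c\colon \mathcal{A} \to \mathcal{B}$ such that
$c \circ a = b$.

An object $\mathcal{Z}$ is called a terminal object if for any object
$\mathcal{A}$ there exists a unique morphism from $\mathcal{A}$ to
$\mathcal{Z}$, which is denoted
$z_{\mathcal{A}}\colon \mathcal{A} \to \mathcal{Z}$ in what follows.

A category $\mathbf{D}$ is called a subcategory of a category $\mathbf{C}$ if
$\mathrm{Ob}(\mathbf{D}) \subseteq \mathrm{Ob}(\mathbf{C})$,
$\mathrm{Ar}(\mathbf{D}) \subseteq \mathrm{Ar}(\mathbf{C})$, and the
composition of morphisms in $\mathbf{D}$ coincides with their composition in
$\mathbf{C}$.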

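It is said that a category has (pairwise) products if for every pair of
objects $\mathcal{A}$ and $\mathcal{B}$ there exists their product, that is,
an object $\mathcal{A} \times \mathcal{B}$ and a pair of morphisms
$\pi_{\mathcal{A},\mathcal{B}}\colon \mathcal{A} \times \mathcal{B} \to \mathcal{A}$
and
$\nu_{\mathcal{A},\mathcal{B}}\colon \mathcal{A} \times \mathcal{B} \to \mathcal{B}$,
called projections, such that for any object $\mathcal{D}$ and for any pair of
morphisms $a\colon \mathcal{D} \to \mathcal{A}$ and
$b\colon \mathcal{D} \to \mathcal{B}$ there exists a unique morphism
$c\colon \mathcal{D} \to \mathcal{A} \times \mathcal{B}$ that makes the
corresponding product diagram commute, i.e., satisfies the following
conditions:

$$\pi_{\mathcal{A},\mathcal{B}} \circ c = a, \qquad \nu_{\mathcal{A},\mathcal{B}} \circ c = b. \tag{1}$$

We call such a morphism $c$ the product of the morphisms $a$ and $b$ and
denote it $a * b$.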

It is easily seen that existence of products in a category implies the 
following equality: 

(a * b) o d = (a o d) * (b o d). (2) 

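In a category with products, for two arbitrary morphisms
$a\colon \mathcal{A} \to \mathcal{C}$ and $b\colon \mathcal{B} \to \mathcal{D}$
one can define the morphism $a \times b$:

$$a \times b\colon \mathcal{A} \times \mathcal{B} \to \mathcal{C} \times \mathcal{D}, \qquad a \times b \overset{\mathrm{def}}{=} (a \circ \pi_{\mathcal{A},\mathcal{B}}) * (b \circ \nu_{\mathcal{A},\mathcal{B}}). \tag{3}$$

This definition and (1) obviously imply that the morphism $c = a \times b$
satisfies the following conditions:

$$\pi_{\mathcal{C},\mathcal{D}} \circ c = a \circ \pi_{\mathcal{A},\mathcal{B}}, \qquad \nu_{\mathcal{C},\mathcal{D}} \circ c = b \circ \nu_{\mathcal{A},\mathcal{B}}. \tag{4}$$

Moreover, $c = a \times b$ is the only morphism satisfying conditions (4).

It is also easily seen that (2) and (3) imply the following equality:

$$(a \times b) \circ (c * d) = (a \circ c) * (b \circ d). \tag{5}$$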
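
As an illustration only, the identities (1), (2), and (5) can be checked in the most familiar category with products, the category of sets and functions, where $a * b$ is the map $x \mapsto (a(x), b(x))$. The Python sketch below is not part of the original development: the helper names (`compose`, `pair`, `cross`) and the sample integer functions are hypothetical, and the check is a pointwise test on a finite range, not a proof.

```python
def compose(g, f):
    """Return the composite g o f (apply f first, then g)."""
    return lambda x: g(f(x))

def pair(a, b):
    """Product of morphisms a * b in Set: x |-> (a(x), b(x))."""
    return lambda x: (a(x), b(x))

def pi(p):
    """First projection from a pair."""
    return p[0]

def nu(p):
    """Second projection from a pair."""
    return p[1]

def cross(a, b):
    """a x b built as in Eq. (3): (a o pi) * (b o nu)."""
    return pair(compose(a, pi), compose(b, nu))

# Hypothetical sample morphisms on the integers, chosen only for the check.
def a(n): return n + 1
def b(n): return 2 * n
def c(n): return n * n
def d(n): return n - 3

points = range(-5, 6)

# Eq. (1): the projections recover a and b from a * b.
eq1_holds = all(pi(pair(a, b)(x)) == a(x) and nu(pair(a, b)(x)) == b(x)
                for x in points)
# Eq. (2): (a * b) o d = (a o d) * (b o d).
eq2_holds = all(compose(pair(a, b), d)(x) == pair(compose(a, d), compose(b, d))(x)
                for x in points)
# Eq. (5): (a x b) o (c * d) = (a o c) * (b o d).
eq5_holds = all(compose(cross(a, b), pair(c, d))(x) == pair(compose(a, c), compose(b, d))(x)
                for x in points)

print(eq1_holds, eq2_holds, eq5_holds)
```

In this deterministic setting all three checks succeed; the point of the axioms that follow is that (2) may fail once nondeterministic ITs are admitted, while (5) is retained.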

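Suppose $\mathcal{A} \times \mathcal{B}$ and $\mathcal{B} \times \mathcal{A}$
are two products of the objects $\mathcal{A}$ and $\mathcal{B}$ taken in
different order. By the properties of products, the objects
$\mathcal{A} \times \mathcal{B}$ and $\mathcal{B} \times \mathcal{A}$ are
isomorphic and the natural isomorphism is

$$\sigma_{\mathcal{A},\mathcal{B}}\colon \mathcal{A} \times \mathcal{B} \to \mathcal{B} \times \mathcal{A}, \qquad \sigma_{\mathcal{A},\mathcal{B}} \overset{\mathrm{def}}{=} \nu_{\mathcal{A},\mathcal{B}} * \pi_{\mathcal{A},\mathcal{B}}, \tag{6}$$

i.e., the unique morphism that is compatible with the projections of
$\mathcal{A} \times \mathcal{B}$ and $\mathcal{B} \times \mathcal{A}$.

Moreover, for any object $\mathcal{D}$ and for any morphisms
$a\colon \mathcal{D} \to \mathcal{A}$ and $b\colon \mathcal{D} \to \mathcal{B}$,
the morphisms $a * b$ and $b * a$ are isomorphic, that is,

$$\sigma_{\mathcal{A},\mathcal{B}} \circ (a * b) = b * a. \tag{7}$$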

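Similarly, by the properties of products, the objects
$(\mathcal{A} \times \mathcal{B}) \times \mathcal{C}$ and
$\mathcal{A} \times (\mathcal{B} \times \mathcal{C})$ are isomorphic. Let

$$\alpha_{\mathcal{A},\mathcal{B},\mathcal{C}}\colon (\mathcal{A} \times \mathcal{B}) \times \mathcal{C} \to \mathcal{A} \times (\mathcal{B} \times \mathcal{C})$$

be the corresponding natural isomorphism. Its "explicit" form is:

$$\alpha_{\mathcal{A},\mathcal{B},\mathcal{C}} = (\pi_{\mathcal{A},\mathcal{B}} \circ \pi_{\mathcal{A}\times\mathcal{B},\mathcal{C}}) * \bigl((\nu_{\mathcal{A},\mathcal{B}} \circ \pi_{\mathcal{A}\times\mathcal{B},\mathcal{C}}) * \nu_{\mathcal{A}\times\mathcal{B},\mathcal{C}}\bigr). \tag{8}$$

It can be easily obtained from the following diagram:

[Diagram: the objects $(\mathcal{A} \times \mathcal{B}) \times \mathcal{C}$,
$\mathcal{A} \times \mathcal{B}$, $\mathcal{C}$,
$\mathcal{B} \times \mathcal{C}$, and
$\mathcal{A} \times (\mathcal{B} \times \mathcal{C})$ connected by their
projections.]

Then for any object $\mathcal{D}$ and for any morphisms
$a\colon \mathcal{D} \to \mathcal{A}$, $b\colon \mathcal{D} \to \mathcal{B}$,
and $c\colon \mathcal{D} \to \mathcal{C}$ we have

$$\alpha_{\mathcal{A},\mathcal{B},\mathcal{C}} \circ ((a * b) * c) = a * (b * c). \tag{9}$$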



2.3 Elementary axioms for categories of information transformers

In this subsection we set out the main properties of categories of ITs.
All of the following study relies exactly on these properties.

In [13,18,19,21,23,25] it is shown that classes of information transformers
can be considered as morphisms in certain categories. As a rule, such
categories do not have products, which is a peculiar expression of the
nondeterministic nature of ITs in these categories. However, it turns out
that deterministic information transformers, which are usually determined in
a natural way in any category of ITs, form a subcategory with products.
This makes it possible to define a "product" of objects in a category of ITs.
Moreover, it provides an axiomatic way to describe an extension of the
product operation from the subcategory of deterministic ITs to the whole
category of ITs.

Definition 2.1 We shall say that a category $\mathbf{C}$ is a category of
information transformers if the following axioms hold:

1. There is a fixed subcategory of deterministic ITs $\mathbf{D}$ that
contains all the objects of the category $\mathbf{C}$
($\mathrm{Ob}(\mathbf{D}) = \mathrm{Ob}(\mathbf{C})$).

2. The classes of isomorphisms in $\mathbf{D}$ and in $\mathbf{C}$ coincide,
that is, all the isomorphisms in $\mathbf{C}$ are deterministic.

3. The categories $\mathbf{D}$ and $\mathbf{C}$ have a common terminal object
$\mathcal{Z}$.

4. The category $\mathbf{D}$ has pairwise products.

5. There is a specified extension of the morphism product from the
subcategory $\mathbf{D}$ to the whole category $\mathbf{C}$, that is, for any
object $\mathcal{D}$ and for any pair of morphisms
$a\colon \mathcal{D} \to \mathcal{A}$ and $b\colon \mathcal{D} \to \mathcal{B}$
in $\mathbf{C}$ there is a certain information transformer
$a * b\colon \mathcal{D} \to \mathcal{A} \times \mathcal{B}$ (which is also
called a product of the ITs $a$ and $b$) such that

$$\pi_{\mathcal{A},\mathcal{B}} \circ (a * b) = a, \qquad \nu_{\mathcal{A},\mathcal{B}} \circ (a * b) = b.$$

6. Let $a\colon \mathcal{A} \to \mathcal{C}$ and
$b\colon \mathcal{B} \to \mathcal{D}$ be arbitrary ITs in $\mathbf{C}$; then
the IT $a \times b$ defined by Eq. (3) satisfies Eq. (5):

$$(a \times b) \circ (c * d) = (a \circ c) * (b \circ d).$$

7. Equality (7) holds not only in $\mathbf{D}$ but in $\mathbf{C}$ as well,
that is, the product of information transformers is "commutative up to
isomorphism."

8. Equality (9) also holds in $\mathbf{C}$. In other words, the product of
information transformers is "associative up to isomorphism" too.

Now let us make several comments concerning the above definition. 

We stress that in the description of the extension of the morphism product
from the category $\mathbf{D}$ to $\mathbf{C}$ (Axiom 5) we do not require the
uniqueness of an IT $c\colon \mathcal{D} \to \mathcal{A} \times \mathcal{B}$
that satisfies conditions (1).

Nevertheless, it is easily verified that the equations (4) are valid for
$c = a \times b$ not only in the category $\mathbf{D}$, but in $\mathbf{C}$ as
well, that is,

$$\pi_{\mathcal{C},\mathcal{D}} \circ (a \times b) = a \circ \pi_{\mathcal{A},\mathcal{B}}, \qquad \nu_{\mathcal{C},\mathcal{D}} \circ (a \times b) = b \circ \nu_{\mathcal{A},\mathcal{B}}.$$

However, the IT $c$ that satisfies the equations (4) may be not unique. Note
also that in the category $\mathbf{C}$ Eq. (2) in general does not hold.

Further, note that Axiom 6 immediately implies

$$(a \times b) \circ (c \times d) = (a \circ c) \times (b \circ d).$$

Finally, note that any category that has a terminal object and pairwise 
products can be considered as a category of ITs in which all information 
transformers are deterministic. 

3 Fuzzy logic and fuzzy quantifiers 

In this section we shall introduce the fuzzy interpretation of formulas of first 
order logic which is an extension of the classical interpretation. 

3.1 Truth values and operations of fuzzy logic 

Let $I = [0, 1]$ be the closed unit interval, the set of fuzzy truth values.
For the sake of simplicity of notation we shall denote any fuzzy proposition
$a$ and its truth value (the evaluation of $a$) by the same symbol, i.e.,
$a \in I$, but we shall mark fuzzy logical operations by a dot in order to
avoid ambiguity.
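Let $a$ and $b$ be any fuzzy propositions, i.e., $a, b \in [0, 1]$. We put by
definition for fuzzy disjunction, conjunction, negation, and implication:

$$a \mathbin{\dot\vee} b \overset{\mathrm{def}}{=} \max(a, b), \qquad a \mathbin{\dot\wedge} b \overset{\mathrm{def}}{=} \min(a, b),$$

$$\dot\neg a \overset{\mathrm{def}}{=} 1 - a, \qquad a \mathbin{\dot\to} b \overset{\mathrm{def}}{=} \dot\neg a \mathbin{\dot\vee} b.$$

Let $A(x)$ be any $\mathcal{X}$-indexed family of propositions,
$x \in \mathcal{X}$; then define multivalent disjunction and conjunction:

$$\dot\bigvee_x A(x) \overset{\mathrm{def}}{=} \sup_x A(x), \qquad \dot\bigwedge_x A(x) \overset{\mathrm{def}}{=} \inf_x A(x).$$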
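
The connectives of this subsection can be sketched in a few lines of Python. This is a minimal illustration, not the author's code: the function names are hypothetical, and taking negation as $1 - a$ and implication as $\dot\neg a \mathbin{\dot\vee} b$ is an assumption of the sketch, chosen because it is consistent with the duality between bounded quantifiers stated in Section 3.4. Membership grades are restricted to multiples of 0.25 so that the floating-point comparisons are exact.

```python
def f_or(a, b):
    """Fuzzy disjunction: max(a, b)."""
    return max(a, b)

def f_and(a, b):
    """Fuzzy conjunction: min(a, b)."""
    return min(a, b)

def f_not(a):
    """Fuzzy negation, assumed here to be 1 - a."""
    return 1.0 - a

def f_implies(a, b):
    """Fuzzy implication, assumed here to be (not a) or b."""
    return f_or(f_not(a), b)

# On the crisp values {0, 1} the operations reduce to classical logic.
crisp = [0.0, 1.0]
classical_ok = all(
    f_or(p, q) == float(bool(p) or bool(q))
    and f_and(p, q) == float(bool(p) and bool(q))
    and f_implies(p, q) == float((not bool(p)) or bool(q))
    for p in crisp for q in crisp
)

# De Morgan's law holds for every pair of grades on this grid.
grades = [0.0, 0.25, 0.5, 0.75, 1.0]
de_morgan_ok = all(
    f_not(f_or(p, q)) == f_and(f_not(p), f_not(q))
    for p in grades for q in grades
)

print(classical_ok, de_morgan_ok)
print(f_implies(0.75, 0.25))
```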



3.2 Fuzzy sets as fuzzy predicates 

When we consider some family of fuzzy propositions $A(x)$,
$x \in \mathcal{X}$, we may say that $A$ is a fuzzy property (predicate,
characteristic) of elements of the set $\mathcal{X}$. That is, an element $x$
of $\mathcal{X}$ satisfies $A$ with the degree $A(x) \in [0, 1]$. On the other
hand, any fuzzy property of elements of $\mathcal{X}$ may be considered as a
fuzzy subset of $\mathcal{X}$.

In what follows we shall identify a fuzzy property, the corresponding
fuzzy set, and its characteristic function, and hence use the same notation
for all three (conceptually different!) notions. We shall also find it very
convenient to denote the grade of membership of $x$ in a fuzzy set $A$ by
$x \mathrel{\dot\in} A$, that is,
$x \mathrel{\dot\in} A \overset{\mathrm{def}}{=} A(x)$.

3.3 Fuzzy quantifiers 

A quantifier in classical logic may be treated as a conjunction or
disjunction of a family of propositions. Specifically, let $A(x)$ be any
(classical) predicate; then the formula $\forall x\, A(x)$ takes the truth
value 1 (true) iff $A(x) = 1$ for all $x$.

Now for any fuzzy predicate $A(x)$, $x \in \mathcal{X}$, define the fuzzy
universal and existential quantifiers:

$$\dot\forall x\, A(x) \overset{\mathrm{def}}{=} \dot\bigwedge_x A(x) = \inf_x A(x),$$

$$\dot\exists x\, A(x) \overset{\mathrm{def}}{=} \dot\bigvee_x A(x) = \sup_x A(x).$$
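3.4 Bounded quantifiers

Bounded quantifiers are convenient for manipulating statements of the kind
"all x such that... satisfy..." and "there exists x such that...
satisfying...". We shall need fuzzy quantifiers with fuzzy bounds.
By analogy with classical logic let us put by definition:

$$\dot\forall x \mathrel{\dot\in} A\; B(x) \overset{\mathrm{def}}{=} \dot\forall x\,\bigl(x \mathrel{\dot\in} A \mathbin{\dot\to} B(x)\bigr),$$

$$\dot\exists x \mathrel{\dot\in} A\; B(x) \overset{\mathrm{def}}{=} \dot\exists x\,\bigl(x \mathrel{\dot\in} A \mathbin{\dot\wedge} B(x)\bigr).$$

Note that the bounded quantifiers are related in the same way as the
unbounded ones:

$$\dot\neg\bigl(\dot\forall x \mathrel{\dot\in} A\; B(x)\bigr) = \dot\exists x \mathrel{\dot\in} A\; \dot\neg B(x).$$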
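
On a finite space the suprema and infima become `max` and `min`, so the quantifiers can be evaluated directly. The following Python sketch is illustrative only: the function names and the sample fuzzy sets `A` and `B` are hypothetical, and it assumes negation $1 - a$ and implication $\max(1 - a, b)$, under which the stated relation between the bounded quantifiers holds.

```python
def f_forall(pred, domain):
    """Fuzzy universal quantifier on a finite domain: inf of pred."""
    return min(pred(x) for x in domain)

def f_exists(pred, domain):
    """Fuzzy existential quantifier on a finite domain: sup of pred."""
    return max(pred(x) for x in domain)

def f_forall_in(A, pred, domain):
    """Bounded universal quantifier: forall x (x in A -> pred(x))."""
    return f_forall(lambda x: max(1.0 - A[x], pred(x)), domain)

def f_exists_in(A, pred, domain):
    """Bounded existential quantifier: exists x (x in A and pred(x))."""
    return f_exists(lambda x: min(A[x], pred(x)), domain)

# Hypothetical fuzzy sets on a three-point space.
domain = ["x1", "x2", "x3"]
A = {"x1": 1.0, "x2": 0.5, "x3": 0.0}
B = {"x1": 0.75, "x2": 0.25, "x3": 0.0}

all_A_are_B = f_forall_in(A, lambda x: B[x], domain)
some_A_is_not_B = f_exists_in(A, lambda x: 1.0 - B[x], domain)

print(all_A_are_B)        # 0.5
print(some_A_is_not_B)    # 0.5
print(1.0 - all_A_are_B == some_A_is_not_B)   # True
```

Here "all elements of A satisfy B" holds to degree 0.5, and its negation coincides with "some element of A does not satisfy B", as the duality requires.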

4 Algebra of fuzzy sets 

The algebra of sets in classical set theory is determined by the underlying
classical logic. A systematic extension of set-theoretic notation makes it
possible to introduce the algebra of fuzzy sets in a very natural way.

4.1 Definitions of fuzzy sets by comprehension 

Let $\varphi$ be any fuzzy property of elements $x$ of $\mathcal{X}$. This
property may be any formula in the fuzzy logic discussed above. By
$A = \{x \mid \varphi(x)\}$ we shall denote the fuzzy set such that the level
of membership of $x$ in $A$ is equal to the truth value of $\varphi(x)$, i.e.:

$$A = \{x \mid \varphi(x)\} \quad \text{iff} \quad x \mathrel{\dot\in} A = \varphi(x). \tag{10}$$

4.2 Fuzzy sets operations 

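Suppose that $A$ and $B$ are any fuzzy sets over the space $\mathcal{X}$. The
principle (10) provides definitions of union, intersection, and complement
for fuzzy sets:

$$A \mathbin{\dot\cup} B \overset{\mathrm{def}}{=} \{x \mid x \mathrel{\dot\in} A \mathbin{\dot\vee} x \mathrel{\dot\in} B\},$$

$$A \mathbin{\dot\cap} B \overset{\mathrm{def}}{=} \{x \mid x \mathrel{\dot\in} A \mathbin{\dot\wedge} x \mathrel{\dot\in} B\},$$

$$\overline{A} \overset{\mathrm{def}}{=} \{x \mid \dot\neg(x \mathrel{\dot\in} A)\}.$$

Now let $a(y)$ be any family of fuzzy sets on $\mathcal{X}$ indexed by
$y \in \mathcal{Y}$. By definition, put

$$\dot\bigcup_y a(y) \overset{\mathrm{def}}{=} \{x \mid \dot\exists y\; x \mathrel{\dot\in} a(y)\},$$

$$\dot\bigcap_y a(y) \overset{\mathrm{def}}{=} \{x \mid \dot\forall y\; x \mathrel{\dot\in} a(y)\}.$$

Finally, if the index $y$ itself varies in a fuzzy set $Y$, then operations
for fuzzy families of fuzzy sets are easily described by bounded fuzzy
quantifiers:

$$\dot\bigcup_{y \dot\in Y} a(y) \overset{\mathrm{def}}{=} \{x \mid \dot\exists y \mathrel{\dot\in} Y\; x \mathrel{\dot\in} a(y)\},$$

$$\dot\bigcap_{y \dot\in Y} a(y) \overset{\mathrm{def}}{=} \{x \mid \dot\forall y \mathrel{\dot\in} Y\; x \mathrel{\dot\in} a(y)\}.$$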



4.3 Containment of fuzzy sets 

In contrast to the traditional definition of containment of fuzzy sets,
"$A \subseteq B$ iff $A(x) \le B(x)$ for all $x$", we shall adopt another
definition of containment, which is the natural extension of containment of
crisp sets. In our approach "$A$ is contained in $B$" is a fuzzy property.
Hence it may take fuzzy truth values. By definition, put

$$A \mathrel{\dot\subseteq} B \overset{\mathrm{def}}{=} \dot\forall x \mathrel{\dot\in} A\;\; x \mathrel{\dot\in} B,$$

that is, "$A$ is contained in $B$" with the degree to which "all the elements
contained in $A$ belong to $B$" (in these phrases "contained", "all", and
"belong" are fuzzy notions).
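
A short Python sketch may make the difference from the traditional definition concrete. It is an illustration under stated assumptions, not the author's implementation: fuzzy sets are represented as dictionaries over a common finite space, the names are hypothetical, and negation and implication are taken as $1 - a$ and $\max(1 - a, b)$.

```python
def union(A, B):
    """Fuzzy union: pointwise max."""
    return {x: max(A[x], B[x]) for x in A}

def intersection(A, B):
    """Fuzzy intersection: pointwise min."""
    return {x: min(A[x], B[x]) for x in A}

def complement(A):
    """Fuzzy complement: pointwise 1 - A(x) (assumed negation)."""
    return {x: 1.0 - A[x] for x in A}

def contained(A, B):
    """Degree of 'A is contained in B': inf_x max(1 - A(x), B(x))."""
    return min(max(1.0 - A[x], B[x]) for x in A)

# Hypothetical fuzzy sets on a three-point space; B <= A pointwise.
A = {"p": 1.0, "q": 0.5, "r": 0.0}
B = {"p": 0.75, "q": 0.25, "r": 0.0}

print(contained(B, A))   # 0.75, although B(x) <= A(x) for every x
print(contained(A, B))   # 0.5
print(contained(A, A))   # 0.5

# For crisp sets the degree is 1 or 0, as in classical set theory.
C = {"p": 1.0, "q": 0.0, "r": 0.0}
D = {"p": 1.0, "q": 1.0, "r": 0.0}
print(contained(C, D), contained(D, C))   # 1.0 0.0
```

Under these assumptions a fuzzy set with intermediate grades is contained in itself only to degree 0.5, which shows that the fuzzy containment degree carries different information than the pointwise ordering.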



5 Fuzzy sets and distributions 

We shall say that a fuzzy distribution (or possibility distribution [3]) $X$
on the space $\mathcal{X}$ is any fuzzy set in $\mathcal{X}$. Note that we
consider a fuzzy distribution as an analog of a probability distribution.
5.1 Joint distributions

We shall say that any fuzzy distribution $C$ on
$\mathcal{X} \times \mathcal{Y}$ determines a fuzzy joint distribution [6] of
the pair $(x, y)$, $x \in \mathcal{X}$, $y \in \mathcal{Y}$.

Sometimes it is convenient to interpret a joint distribution as a fuzzy
relation, i.e., to consider $C(x, y)$ as the "degree in which $x$ and $y$
satisfy the relation $C$".

The distribution $C'$ on $\mathcal{Y} \times \mathcal{X}$ is called the
converse of $C$ if

$$C'(y, x) = C(x, y).$$

The joint distribution $C$ induces the marginal distributions [4,5] $X$ and
$Y$ on the spaces $\mathcal{X}$ and $\mathcal{Y}$, respectively:

$$X = \{x \mid \dot\exists y\; C(x, y)\}, \qquad Y = \{y \mid \dot\exists x\; C(x, y)\}.$$

5.2 Transition distributions 

The notion of a transition distribution [8] is of great importance in
mathematical statistics.

We shall say that a fuzzy transition distribution from $\mathcal{X}$ to
$\mathcal{Y}$ is any map $a$ from $\mathcal{X}$ to the family of all fuzzy
distributions on $\mathcal{Y}$, that is, $a$ takes each element
$x \in \mathcal{X}$ to some fuzzy distribution $a(x)$ on the space
$\mathcal{Y}$.

Note that the condition $A(x, y) = a(x)(y)$ for all $x$ in $\mathcal{X}$ and
all $y$ in $\mathcal{Y}$ determines a one-to-one correspondence between joint
distributions $A$ and transition distributions $a$. Using set-theoretic
notation we may write:

$$A = \{(x, y) \mid a(x)(y)\}, \qquad a(x) = \{y \mid A(x, y)\}.$$

By $\bar a$ we shall denote the transition distribution such that
$\bar a(x) = \overline{a(x)}$. Obviously, $\bar a$ corresponds to
$\overline{A}$.

The correspondence between joint distributions and transition distributions
reveals the algebraic semantics of the marginal distributions. Let $a$ and
$\beta$ be the transition distributions corresponding to $A$ and $A'$,
respectively. Then

$$X = \{x \mid \dot\exists y\; \beta(y)(x)\} = \dot\bigcup_y \beta(y),$$

$$Y = \{y \mid \dot\exists x\; a(x)(y)\} = \dot\bigcup_x a(x).$$
5.3 Images of distributions

Just as a probabilistic transition distribution transforms one probability
distribution into another [8], a fuzzy transition distribution takes a fuzzy
distribution to another one.

Let $X$ be any distribution on $\mathcal{X}$ and let $a$ be some transition
distribution from $\mathcal{X}$ to $\mathcal{Y}$. We say that a distribution
$aX$ on $\mathcal{Y}$ is the image of the distribution $X$ induced by the
transition distribution $a$ if

$$aX \overset{\mathrm{def}}{=} \{y \mid \dot\exists x \mathrel{\dot\in} X\;\; y \mathrel{\dot\in} a(x)\} = \dot\bigcup_{x \dot\in X} a(x).$$

There is an interesting notion dual to the notion of the image. We say
that a distribution $a \circ X$ on $\mathcal{Y}$ is the lower image of the
distribution $X$ induced by the transition distribution $a$ if

$$a \circ X \overset{\mathrm{def}}{=} \{y \mid \dot\forall x \mathrel{\dot\in} X\;\; y \mathrel{\dot\in} a(x)\} = \dot\bigcap_{x \dot\in X} a(x).$$

It may be proved that $a \circ X = \overline{\bar a X}$.

5.4 Generated joint distributions 

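Suppose that $X$ is an arbitrary distribution on $\mathcal{X}$ and $a$ is
some transition distribution from $\mathcal{X}$ to $\mathcal{Y}$. We say that
the distribution $a * X$ on $\mathcal{X} \times \mathcal{Y}$ is the joint
distribution generated by $a$ and $X$ if

$$a * X \overset{\mathrm{def}}{=} \{(x, y) \mid x \mathrel{\dot\in} X \mathbin{\dot\wedge} y \mathrel{\dot\in} a(x)\}.$$

It can be easily verified that the image $aX$ coincides with the
$\mathcal{Y}$-marginal distribution of the generated joint distribution
$a * X$.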
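
The constructions of Sections 5.3 and 5.4 can be tried out on a small finite example. The Python sketch below is illustrative and not taken from the paper: the representation (a prior as a dictionary, a transition distribution as a dictionary of dictionaries), the function names, and the numbers are all hypothetical, and negation and implication are assumed to be $1 - a$ and $\max(1 - a, b)$.

```python
xs, ys = ["x1", "x2"], ["y1", "y2"]

# Hypothetical prior distribution X and transition distribution a.
X = {"x1": 1.0, "x2": 0.5}
a = {"x1": {"y1": 1.0, "y2": 0.25},
     "x2": {"y1": 0.0, "y2": 0.75}}

def image(t, P):
    """Image tP: y |-> sup_x min(P(x), t(x)(y))."""
    return {y: max(min(P[x], t[x][y]) for x in xs) for y in ys}

def lower_image(t, P):
    """Lower image t o P: y |-> inf_x max(1 - P(x), t(x)(y))."""
    return {y: min(max(1.0 - P[x], t[x][y]) for x in xs) for y in ys}

def generated_joint(t, P):
    """Generated joint distribution t * P: (x, y) |-> min(P(x), t(x)(y))."""
    return {(x, y): min(P[x], t[x][y]) for x in xs for y in ys}

aX = image(a, X)
a_lower_X = lower_image(a, X)
joint = generated_joint(a, X)

# Y-marginal of the generated joint distribution.
y_marginal = {y: max(joint[(x, y)] for x in xs) for y in ys}

# Duality check: lower image = complement of the image under the
# complemented transition distribution.
a_bar = {x: {y: 1.0 - a[x][y] for y in ys} for x in xs}
dual = {y: 1.0 - v for y, v in image(a_bar, X).items()}

print(aX)          # {'y1': 1.0, 'y2': 0.5}
print(a_lower_X)   # {'y1': 0.5, 'y2': 0.25}
print(y_marginal == aX, dual == a_lower_X)   # True True
```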

5.5 Fuzzy conditional distributions 

We shall define this notion in the same way as its probabilistic prototype is
defined in statistics [4,5,8].

Let $A$ be any joint distribution on $\mathcal{X} \times \mathcal{Y}$. We say
that some transition distribution $a$ from $\mathcal{X}$ to $\mathcal{Y}$ is
(a variant of) the conditional distribution for $A$ with respect to
$\mathcal{X}$ if $A$ is generated by its $\mathcal{X}$-marginal distribution
$X$ together with $a$:

$$A = a * X.$$

Similarly, we say that $\beta$ from $\mathcal{Y}$ to $\mathcal{X}$ is a
conditional distribution for $A$ with respect to $\mathcal{Y}$ if

$$A' = \beta * Y.$$

Theorem 5.1 For any joint distribution $A$ on
$\mathcal{X} \times \mathcal{Y}$, conditional distributions always exist. Some
variants of the conditional distributions $a$ and $\beta$ are determined by
the following expressions:

$$a(x) = \{y \mid A(x, y)\}, \qquad \beta(y) = \{x \mid A(x, y)\}.$$
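
Theorem 5.1 can be checked numerically on a small example. The Python sketch below is only an illustration: the joint distribution, the dictionary representation, and the names are hypothetical. It builds the variants of the conditional distributions given in the theorem, verifies that they regenerate the joint distribution, and exhibits a second variant to show that a conditional distribution need not be unique.

```python
xs, ys = ["x1", "x2"], ["y1", "y2"]

# Hypothetical joint distribution A on X x Y.
A = {("x1", "y1"): 0.75, ("x1", "y2"): 0.25,
     ("x2", "y1"): 0.5,  ("x2", "y2"): 1.0}

# Marginal distributions: sup over the other coordinate.
X = {x: max(A[(x, y)] for y in ys) for x in xs}
Y = {y: max(A[(x, y)] for x in xs) for y in ys}

# Variants of the conditional distributions from Theorem 5.1.
a = {x: {y: A[(x, y)] for y in ys} for x in xs}
beta = {y: {x: A[(x, y)] for x in xs} for y in ys}

def generated_joint(t, P, src, dst):
    """t * P: (s, u) |-> min(P(s), t(s)(u))."""
    return {(s, u): min(P[s], t[s][u]) for s in src for u in dst}

converse_A = {(y, x): A[(x, y)] for x in xs for y in ys}

regenerates_A = generated_joint(a, X, xs, ys) == A
regenerates_converse = generated_joint(beta, Y, ys, xs) == converse_A

# Another variant: where A(x, y) already equals X(x), a(x)(y) can be raised.
a_alt = {x: dict(a[x]) for x in xs}
a_alt["x1"]["y1"] = 1.0
alt_regenerates_A = generated_joint(a_alt, X, xs, ys) == A

print(regenerates_A, regenerates_converse, alt_regenerates_A)
```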
5.6 Iterated quantifiers 

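The possibility of representing any joint distribution by conditional
distributions has interesting consequences that will be used in the Bayesian
decision analysis.

Theorem 5.2 Let $A$ be any joint distribution on
$\mathcal{X} \times \mathcal{Y}$, $X$ and $Y$ its marginal distributions, and
$a$ and $\beta$ conditional distributions, that is,
$A = a * X = (\beta * Y)'$. Then for any formula $\varphi$

$$\dot\forall (x, y) \mathrel{\dot\in} A\; \varphi = \dot\forall x \mathrel{\dot\in} X\; \dot\forall y \mathrel{\dot\in} a(x)\; \varphi = \dot\forall y \mathrel{\dot\in} Y\; \dot\forall x \mathrel{\dot\in} \beta(y)\; \varphi,$$

$$\dot\exists (x, y) \mathrel{\dot\in} A\; \varphi = \dot\exists x \mathrel{\dot\in} X\; \dot\exists y \mathrel{\dot\in} a(x)\; \varphi = \dot\exists y \mathrel{\dot\in} Y\; \dot\exists x \mathrel{\dot\in} \beta(y)\; \varphi.$$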
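
The interchange of quantifiers in Theorem 5.2 can be verified by direct computation on a finite example. The following Python sketch is a hedged illustration: the joint distribution `A`, the formula `phi`, and all names are hypothetical, the conditional distributions are the variants from Theorem 5.1, and the bounded quantifiers are evaluated as $\inf_x \max(1 - A(x), \cdot)$ and $\sup_x \min(A(x), \cdot)$.

```python
xs, ys = ["x1", "x2"], ["y1", "y2"]

# Hypothetical joint distribution A and fuzzy formula phi(x, y).
A = {("x1", "y1"): 0.75, ("x1", "y2"): 0.25,
     ("x2", "y1"): 0.5,  ("x2", "y2"): 1.0}
phi = {("x1", "y1"): 0.5,  ("x1", "y2"): 0.0,
       ("x2", "y1"): 0.25, ("x2", "y2"): 0.75}

# Marginals and conditional distributions (variants from Theorem 5.1).
X = {x: max(A[(x, y)] for y in ys) for x in xs}
Y = {y: max(A[(x, y)] for x in xs) for y in ys}
a = {x: {y: A[(x, y)] for y in ys} for x in xs}
beta = {y: {x: A[(x, y)] for x in xs} for y in ys}

# Universal quantifier over the joint distribution and its iterated forms.
forall_joint = min(max(1.0 - A[p], phi[p]) for p in A)
forall_via_x = min(
    max(1.0 - X[x], min(max(1.0 - a[x][y], phi[(x, y)]) for y in ys))
    for x in xs)
forall_via_y = min(
    max(1.0 - Y[y], min(max(1.0 - beta[y][x], phi[(x, y)]) for x in xs))
    for y in ys)

# Existential quantifier over the joint distribution and its iterated forms.
exists_joint = max(min(A[p], phi[p]) for p in A)
exists_via_x = max(
    min(X[x], max(min(a[x][y], phi[(x, y)]) for y in ys))
    for x in xs)
exists_via_y = max(
    min(Y[y], max(min(beta[y][x], phi[(x, y)]) for x in xs))
    for y in ys)

print(forall_joint, forall_via_x, forall_via_y)   # 0.5 0.5 0.5
print(exists_joint, exists_via_x, exists_via_y)   # 0.75 0.75 0.75
```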

6 Semantics of decision making problems 

6.1 Fuzzy games with a priori information

Let $\mathcal{X}$ be some space of objects of interest; let $\mathcal{D}$ be
some space of decisions; and let $G$ be a fuzzy relation between $\mathcal{X}$
and $\mathcal{D}$, i.e., a fuzzy set in $\mathcal{X} \times \mathcal{D}$ that
determines the notion "good decision," i.e.,
$G = \{(x, d) \mid d \text{ is a good decision for } x\}$.

We are dealing in fact with the two-person game [7]
$(\mathcal{X}, \mathcal{D}, G)$, where $\mathcal{X}$ is the decision space of
the first player, $\mathcal{D}$ is the decision space of the second player,
and the fuzzy notion $G$ of the quality of decisions for the second player
replaces the classical loss function. Hence we shall call
$(\mathcal{X}, \mathcal{D}, G)$ the fuzzy game [4,5]. If it is known a priori
that all choices of the first player are restricted by a fuzzy set $X$, then
we are dealing with the fuzzy game with a priori information [4,5].

6.2 Good decisions 

Let us now define the notion $G_X$ = "for all $x$ in $X$ decision $d$ is good
for $x$", i.e., formally:

$$G_X(d) \overset{\mathrm{def}}{=} \dot\forall x \mathrel{\dot\in} X\; G(x, d). \tag{11}$$

The truth value of this formula determines the quality of the decision $d$
for $X$. Note that the distribution of good decisions permits the following
interpretation. Let the transition distribution $\gamma$ from $\mathcal{X}$ to
$\mathcal{D}$ be determined by $\gamma(x) = \{d \mid G(x, d)\}$, i.e., the
distribution of "good decisions for $x$". Let $\gamma'$ be the converse of
$\gamma$, i.e., $\gamma'(d) = \{x \mid G(x, d)\}$. Then

$$G_X = \{d \mid \dot\forall x \mathrel{\dot\in} X\;\; x \mathrel{\dot\in} \gamma'(d)\} = \{d \mid X \mathrel{\dot\subseteq} \gamma'(d)\}.$$

Another interpretation of $G_X$, its algebraic semantics, comes from the
representation:

$$G_X = \dot\bigcap_{x \dot\in X} \gamma(x) = \gamma \circ X.$$

Note that in statistics the loss function is often used. Suppose that some
joint fuzzy distribution $B$ on $\mathcal{X} \times \mathcal{D}$ is fixed and
let $\beta$ be the associated transition distribution, $\beta(x)$ = "bad
decisions for $x$" $= \{d \mid (x, d) \mathrel{\dot\in} B\}$. Then the fuzzy
set of bad decisions with respect to the a priori distribution $X$ may be
defined by:

$$B_X = \{d \mid \dot\exists x \mathrel{\dot\in} X\; B(x, d)\} = \dot\bigcup_{x \dot\in X} \beta(x) = \beta X.$$

However, if $B = \overline{G}$ (which is quite natural), then we have
$\beta = \bar\gamma$ and

$$B_X = \beta X = \bar\gamma X = \overline{\gamma \circ X} = \overline{G_X},$$

so the two approaches are dual to each other.

Let us consider an optimal decision problem for the second player with
respect to the a priori information $X$. The statement "there exists an
optimal decision for the second player" may be expressed by

$$\dot\exists d\; \dot\forall x \mathrel{\dot\in} X\; G(x, d) = \dot\exists d\;\; d \mathrel{\dot\in} G_X. \tag{12}$$

We shall say that a decision $d_X$ is optimal (or a Bayes decision) with
respect to the a priori distribution $X$ (or simply $X$-optimal) if $G_X(d)$
takes on its maximum at $d_X$.
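
For finite spaces the fuzzy set of good decisions and a Bayes decision can be computed directly from (11). The Python sketch below is an illustration under stated assumptions: the prior `X`, the goodness relation `G`, and the names are hypothetical, and the bounded universal quantifier is evaluated as $\inf_x \max(1 - X(x), G(x, d))$. It also checks the duality with the "bad decisions" formulation for $B = \overline{G}$.

```python
xs, ds = ["x1", "x2"], ["d1", "d2"]

# Hypothetical a priori distribution X and goodness relation G.
X = {"x1": 1.0, "x2": 0.5}
G = {("x1", "d1"): 0.75, ("x1", "d2"): 0.25,
     ("x2", "d1"): 0.25, ("x2", "d2"): 1.0}

# Eq. (11): G_X(d) = forall x in X G(x, d).
G_X = {d: min(max(1.0 - X[x], G[(x, d)]) for x in xs) for d in ds}

# A Bayes decision maximizes G_X; Eq. (12) is the value of that maximum.
bayes_decision = max(ds, key=lambda d: G_X[d])
exists_good_decision = max(G_X.values())

# Dual formulation with bad decisions B = complement of G:
# B_X(d) = exists x in X B(x, d).
B_X = {d: max(min(X[x], 1.0 - G[(x, d)]) for x in xs) for d in ds}
duality_holds = all(B_X[d] == 1.0 - G_X[d] for d in ds)

print(G_X)                                   # {'d1': 0.5, 'd2': 0.25}
print(bayes_decision, exists_good_decision)  # d1 0.5
print(duality_holds)                         # True
```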

6.3 Optimal decision strategies 

Now consider problems of constructing optimal decision strategies in fuzzy
experiments. Suppose that a fuzzy experiment from $\mathcal{X}$ to
$\mathcal{Y}$ is determined by a fuzzy transition distribution $a$. A decision
strategy (or simply strategy) $r$ is any mapping from the observation space
$\mathcal{Y}$ to the decision space $\mathcal{D}$.
Let us define the relation $H(x, r)$ = "decision strategy $r$ is good for $x$"
by

$$H(x, r) \overset{\mathrm{def}}{=} \dot\forall y \mathrel{\dot\in} a(x)\;\; r(y) \mathrel{\dot\in} \gamma(x).$$

It may be shown that $H(x, r) = r\,a(x) \mathrel{\dot\subseteq} \gamma(x)$,
where $r\,a(x)$ is the image of the distribution $a(x)$ under $r$.

In fact, we have come to a new two-person game
$(\mathcal{X}, \mathcal{D}^{\mathcal{Y}}, H)$, where the decision space of the
second player is replaced by the set of all strategies, and the goodness
relation $G$ for decisions is replaced by the goodness relation for
strategies. Such games are analogous to classical statistical games [7].

Finally, for a given a priori distribution $X$ we define the Bayes goodness
of a strategy with respect to $X$, i.e., the fuzzy set on
$\mathcal{D}^{\mathcal{Y}}$, $H_X(r)$ = "$r$ is good for all $x$ in $X$":

$$H_X(r) \overset{\mathrm{def}}{=} \dot\forall x \mathrel{\dot\in} X\; H(x, r). \tag{13}$$

The optimal decision strategy problem is that of obtaining a mapping $r_X$,
called a Bayes strategy, such that the level of goodness of $r_X$ in (13) is
the highest.
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6.4 Bayes principle 

A well-known result of the theory of statistical games states that the optimal
decision strategy problem for a statistical game may be reduced to a
similar problem for the original game [7]. This reduction involves conditional
distributions.

Theorem 6.1 Let $X$ be any a priori distribution in $\mathcal{X}$, $a$ any transition
distribution from $\mathcal{X}$ to $\mathcal{Y}$, and $\beta$ a variant of a conditional distribution
for $a * X$ with respect to $\mathcal{Y}$. Assume that for every $y$ in $\mathcal{Y}$ there exists a
Bayes decision $d_{\beta(y)}$, which is optimal for the original decision problem with
respect to the distribution $\beta(y)$. Then the optimal decision strategy $r_X$ and
the corresponding goodness $H_X(r_X)$ are determined by

\[ r_X(y) = d_{\beta(y)}, \qquad H_X(r_X) = \forall y \in Y\ \ G_{\beta(y)}\bigl(d_{\beta(y)}\bigr). \]

For an arbitrary decision strategy $r$

\[ \begin{aligned}
H_X(r) &= \forall x \in X\ \forall y \in a(x)\ \ r(y) \in \gamma(x)
        = \forall (x,y) \in a * X\ \ r(y) \in \gamma(x) \\
       &= \forall y \in aX\ \forall x \in \beta(y)\ \ r(y) \in \gamma(x)
        = \forall y \in aX\ \ r(y) \in G_{\beta(y)}.
\end{aligned} \]

In the calculations above we utilize the definition of the conditional
distribution, and in the last step we use the definition of $G_X(d)$ (2) for $X = \beta(y)$.

Hence, by virtue of the optimality of the Bayes decision $d_{\beta(y)}$, we have
for any $y$

\[ G_{\beta(y)}\bigl(r(y)\bigr) \leqslant G_{\beta(y)}\bigl(d_{\beta(y)}\bigr) = G_{\beta(y)}\bigl(r_X(y)\bigr). \]

Therefore we obtain

\[ H_X(r) = \forall y \in aX\ \ G_{\beta(y)}\bigl(r(y)\bigr) \;\leqslant\; \forall y \in aX\ \ G_{\beta(y)}\bigl(r_X(y)\bigr) = H_X(r_X). \]

Note that the notion of conditional distribution and the closely related
"interchange" of quantifiers play the principal role in the proof.

The Bayes principle is very intuitive. It asserts that to construct (or
calculate) an optimal strategy $r_X$ for a given observation $y$ one has to find the
conditional distribution of $x$ for the given $y$, i.e., $\beta(y)$, and then take a decision
$d_{\beta(y)}$ which is optimal with respect to this distribution. In other words, the
observation of $y$ results in the passage from the a priori information $X$ to
the a posteriori information $\beta(y)$.

7 Informativeness of information transformers

7.1 Accuracy relation 

In order to define the informativeness relation we first need to introduce
the following auxiliary notion.

Definition 7.1 We will say that $\geqslant$ is an accuracy relation on an IT-category
$\mathbf{C}$ if for any pair of objects $\mathcal{A}$ and $\mathcal{B}$ in $\mathbf{C}$ the set $\mathbf{C}(\mathcal{A}, \mathcal{B})$ of all ITs
from $\mathcal{A}$ to $\mathcal{B}$ is equipped with a partial order $\geqslant$ that satisfies the following
monotonicity conditions:

\[ a \geqslant a',\ \ b \geqslant b' \;\Rightarrow\; a \circ b \geqslant a' \circ b', \]

\[ a \geqslant a',\ \ b \geqslant b' \;\Rightarrow\; a * b \geqslant a' * b'. \]

Thus, the composition and the product are monotone with respect to
the partial order $\geqslant$. For a pair of ITs $a, b \in \mathbf{C}(\mathcal{A}, \mathcal{B})$ we shall say that $a$ is
more accurate than $b$ whenever $a \geqslant b$.

It follows from the definition of the operation $\times$ and
from the monotonicity conditions that the operation $\times$ is monotone as well:

\[ a \geqslant a',\ \ b \geqslant b' \;\Rightarrow\; a \times b \geqslant a' \times b'. \]

It is clear that for any IT-category there exists at least a "trivial variant"
of the partial order $\geqslant$, namely, one can choose the equality relation for $\geqslant$,
that is, one can put $a \geqslant b \Leftrightarrow a = b$. However, many categories of ITs
(for example, multivalued and fuzzy ITs) provide a "natural" choice of the
accuracy relation, which is different from the equality relation.

7.2 Definition of informativeness relation 

Suppose $a\colon \mathcal{D} \to \mathcal{A}$ and $b\colon \mathcal{D} \to \mathcal{B}$ are two information transformers with
a common source $\mathcal{D}$. Assume that there exists an IT $c\colon \mathcal{A} \to \mathcal{B}$ such that
$c \circ a = b$. Then any information that can be obtained from $b$ can be obtained
from $a$ as well (by attaching the IT $c$ next to $a$). Thus, it is natural to
consider the information transformer $a$ as being more informative than the
IT $b$ and also more informative than any IT less accurate than $b$.

Now we give the formal definition of the informativeness relation in the
category of information transformers.

Definition 7.2 We shall say that an information transformer $a$ is more
informative (better) than $b$ if there exists an information transformer $c$ such
that $c \circ a \geqslant b$, that is,

\[ a \succcurlyeq b \;\stackrel{\mathrm{def}}{\Longleftrightarrow}\; \exists c\ \ \ c \circ a \geqslant b. \]

It is easily verified that the informativeness relation $\succcurlyeq$ is a preorder on
the class of information transformers in $\mathbf{C}$. This preorder induces an
equivalence relation $\sim$ in the following way:

\[ a \sim b \;\Longleftrightarrow\; a \succcurlyeq b \ \ \&\ \ b \succcurlyeq a. \]

Obviously, the relation "more informative" extends the relation "more
accurate," that is,

\[ a \geqslant b \;\Rightarrow\; a \succcurlyeq b. \]

7.3 Main properties of informativeness 

It can be easily verified that the informativeness relation $\succcurlyeq$ satisfies the
following natural properties.

Lemma 7.3 Consider all information transformers with a fixed source $\mathcal{D}$.

(a) The identity information transformer $i_{\mathcal{D}}$ is the most informative and
the terminal information transformer $z_{\mathcal{D}}$ is the least informative:

\[ \forall a \quad i_{\mathcal{D}} \succcurlyeq a \succcurlyeq z_{\mathcal{D}}. \]

(b) Any information transformer $a\colon \mathcal{D} \to \mathcal{B} \times \mathcal{C}$ is more informative than
its parts $\pi_{\mathcal{B},\mathcal{C}} \circ a$ and $\nu_{\mathcal{B},\mathcal{C}} \circ a$.

(c) The product $a * b$ is more informative than its components:

\[ a * b \succcurlyeq a, b. \]

Furthermore, the informativeness relation is compatible with the
composition and the product operations.

Lemma 7.4 (a) If $a \succcurlyeq b$, then $a \circ c \succcurlyeq b \circ c$.
(b) If $a \succcurlyeq b$ and $c \succcurlyeq e$, then $a * c \succcurlyeq b * e$.

7.4 Structure of the family of informativeness equivalence classes

Let $a$ be some information transformer. We shall denote by $[a]$ the equivalence
class of $a$ with respect to informativeness. We shall also use boldface
for equivalence classes, that is, $a \in \mathbf{a}$ is equivalent to $\mathbf{a} = [a]$.

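Theorem 7.5 Let $\mathfrak{J}(\mathcal{D})$ be the family of informativeness equivalence classes
for the class of all information transformers with a fixed domain $\mathcal{D}$. The
family $\mathfrak{J}(\mathcal{D})$ forms a partially ordered Abelian monoid $\langle \mathfrak{J}(\mathcal{D}), \succcurlyeq, *, \mathbf{0} \rangle$ with the
smallest element $\mathbf{0}$ and the largest element $\mathbf{1}$, where

\[ [a] \succcurlyeq [b] \;\stackrel{\mathrm{def}}{\Longleftrightarrow}\; a \succcurlyeq b, \qquad
   [a] * [b] \stackrel{\mathrm{def}}{=} [a * b], \qquad
   \mathbf{0} \stackrel{\mathrm{def}}{=} [z_{\mathcal{D}}], \qquad
   \mathbf{1} \stackrel{\mathrm{def}}{=} [i_{\mathcal{D}}]. \]

Moreover, the following properties hold:

(a) $\mathbf{0} * \mathbf{a} = \mathbf{a}$,

(b) $\mathbf{1} * \mathbf{a} = \mathbf{1}$,

(c) $\mathbf{0} \preccurlyeq \mathbf{a} \preccurlyeq \mathbf{1}$,

(d) $\mathbf{a} * \mathbf{b} \succcurlyeq \mathbf{a}, \mathbf{b}$,

(e) $(\mathbf{a} \succcurlyeq \mathbf{b})\ \&\ (\mathbf{c} \succcurlyeq \mathbf{e}) \;\Rightarrow\; \mathbf{a} * \mathbf{c} \succcurlyeq \mathbf{b} * \mathbf{e}$.

7.5 Informativeness structure of concrete IT-categories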



7.5.1 Linear stochastic ITs 

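In addition to the trivial accuracy relation (which coincides with the
equality relation) one can define the accuracy relation in the following way:

\[ \langle A_a, \Sigma_a \rangle \geqslant \langle A_b, \Sigma_b \rangle \;\Longleftrightarrow\; A_a = A_b,\ \ \Sigma_a \leqslant \Sigma_b. \]

However, it can be proved that the informativeness relations corresponding
to these two accuracy relations actually coincide.

Theorem 7.6 In the category of linear information transformers every
equivalence class $[a]$ corresponds to a pair $(\mathcal{Q}, S)$, where $\mathcal{Q} \subseteq \mathcal{D}$ is a
Euclidean subspace and $S\colon \mathcal{Q} \to \mathcal{Q}$ is a nonnegative definite operator, that is,
$S \geqslant 0$. In these terms

\[ (\mathcal{Q}_1, S_1) \succcurlyeq (\mathcal{Q}_2, S_2) \;\Longleftrightarrow\; \mathcal{Q}_1 \supseteq \mathcal{Q}_2,\ \ S_1 \restriction \mathcal{Q}_2 \leqslant S_2. \]

Here $S_1 \restriction \mathcal{Q}_2$ (the restriction of $S_1$ to $\mathcal{Q}_2$) is defined by the expression
$S_1 \restriction \mathcal{Q}_2 \stackrel{\mathrm{def}}{=} P_2 I_1 S_1 P_1 I_2$, where $I_j\colon \mathcal{Q}_j \to \mathcal{D}$ is the subspace inclusion and
$P_j\colon \mathcal{D} \to \mathcal{Q}_j$ is the orthogonal projection (cf. [18,33]).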

7.5.2 The category of sets as a category of ITs 

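It is not hard to prove that for a given set $\mathcal{D}$ the informativeness
equivalence class of an IT $a$ with domain $\mathcal{D}$ is completely
determined by the following equivalence relation $\approx_a$ on $\mathcal{D}$:

\[ x \approx_a y \;\stackrel{\mathrm{def}}{\Longleftrightarrow}\; ax = ay \qquad \forall x, y \in \mathcal{D}. \]

Furthermore, $a \succcurlyeq b$ if and only if the equivalence relation $\approx_a$ is finer than
$\approx_b$, that is,

\[ a \succcurlyeq b \;\Longleftrightarrow\; \forall x, y \in \mathcal{D}\ \ (x \approx_a y \Rightarrow x \approx_b y). \]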

It is clear that we have the following

Theorem 7.7 The partially ordered monoid of equivalence classes of ITs
with the source $\mathcal{D}$ is isomorphic to the monoid of all equivalence relations
on $\mathcal{D}$ equipped with the order "finer" and with the product

\[ x\, (\approx_a * \approx_b)\, y \;\Longleftrightarrow\; (x \approx_a y,\ x \approx_b y) \qquad \forall x, y \in \mathcal{D}. \]
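
As an illustration only, the correspondence between informativeness of maps and fineness of their kernel partitions can be checked by brute force on a small finite set. The sketch below assumes that ITs in the category of sets are represented as Python dictionaries and that the product of two maps with a common source sends x to the pair of their values; the helper names kernel, is_finer, more_informative, and star are hypothetical.

```python
from itertools import product

def kernel(a, domain):
    """Partition of `domain` into the classes x ~ y <=> a(x) == a(y)."""
    classes = {}
    for x in domain:
        classes.setdefault(a[x], set()).add(x)
    return {frozenset(c) for c in classes.values()}

def is_finer(p, q):
    """True if every class of partition p lies inside some class of q."""
    return all(any(c <= d for d in q) for c in p)

def more_informative(a, b, domain):
    """Brute-force test of 'there exists c with c o a == b' (trivial accuracy).
    Only the values of c on the image of a matter, so c is searched there."""
    a_vals = sorted({a[x] for x in domain})
    b_vals = sorted({b[x] for x in domain})
    for images in product(b_vals, repeat=len(a_vals)):
        c = dict(zip(a_vals, images))
        if all(c[a[x]] == b[x] for x in domain):
            return True
    return False

def star(a, b, domain):
    """Product a * b (assumed form): x -> (a(x), b(x))."""
    return {x: (a[x], b[x]) for x in domain}

D = [0, 1, 2, 3]
a = {0: "p", 1: "p", 2: "q", 3: "r"}   # classes {0,1}, {2}, {3}
b = {0: "u", 1: "u", 2: "u", 3: "v"}   # classes {0,1,2}, {3}
e = {0: "s", 1: "t", 2: "t", 3: "t"}   # classes {0}, {1,2,3}

print(more_informative(a, b, D), is_finer(kernel(a, D), kernel(b, D)))
print(sorted(map(sorted, kernel(star(b, e, D), D))))
```

In this example the kernel of the product of b and e is the common refinement of the two kernels, in agreement with the product of equivalence relations in Theorem 7.7.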

7.5.3 Multivalued ITs

In addition to the trivial accuracy relation, in the category of multivalued
ITs one can put

\[ a \geqslant b \;\stackrel{\mathrm{def}}{\Longleftrightarrow}\; \forall x \in \mathcal{D}\ \ ax \subseteq bx. \]

These two accuracy relations lead to different informativeness relations [21],
called (strong) informativeness $\succcurlyeq$ and weak informativeness.

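For both informativeness relations the classes of equivalent ITs with
a fixed source $\mathcal{D}$ can be described explicitly.

Theorem 7.8 In the category of multivalued ITs with weak informativeness
every class of equivalent ITs corresponds to a certain covering $\mathcal{P}$ of the
set $\mathcal{D}$ such that if $\mathcal{P}$ contains some set $B$ then it contains all its subsets:

\[ \bigl(\exists B \in \mathcal{P}\ \ (A \subseteq B)\bigr) \;\Rightarrow\; A \in \mathcal{P}. \]

Moreover, a covering $\mathcal{P}_1$ is more (weakly) informative than $\mathcal{P}_2$ (namely, $\mathcal{P}_1$
corresponds to a class of more (weakly) informative ITs than $\mathcal{P}_2$) if $\mathcal{P}_1$ is
contained in $\mathcal{P}_2$, that is, $\mathcal{P}_1 \subseteq \mathcal{P}_2$.

Theorem 7.9 In the category of multivalued ITs with strong informativeness
every class of equivalent ITs corresponds to a covering $\mathcal{P}$ of the set $\mathcal{D}$
that satisfies the following condition:

\[ \bigl( (\exists B \in \mathcal{P}\ \ A \subseteq B)\ \ \&\ \ (\exists \mathcal{B} \subseteq \mathcal{P}\ \ A = \textstyle\bigcup \mathcal{B}) \bigr) \;\Rightarrow\; A \in \mathcal{P}. \]

In this case

\[ \mathcal{P}_1 \succcurlyeq \mathcal{P}_2 \;\Longleftrightarrow\; \bigl( (\forall A \in \mathcal{P}_1\ \exists B \in \mathcal{P}_2\ \ A \subseteq B)\ \ \&\ \ (\forall B \in \mathcal{P}_2\ \exists \mathcal{A} \subseteq \mathcal{P}_1\ \ B = \textstyle\bigcup \mathcal{A}) \bigr). \]

8 Informativeness and decision-making problems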



In this section we consider an approach to informativeness comparison
that is alternative to the one above. This approach is based on treating
information transformers as data sources for decision-making problems.

8.1 Decision-making problems in categories of ITs

Results of observations obtained from real sources of information (e.g., indirect
measurements) are, as a rule, unsuitable for straightforward interpretation.
Typically it is assumed that observations suitable for interpretation
are those that take values in a certain object $\mathcal{U}$, which in what follows will be
called the object of interpretations or the object of decisions.

By an interpretable information transformer for signals from an object
$\mathcal{D}$ we mean any information transformer $a\colon \mathcal{D} \to \mathcal{U}$.

It is usually thought that some interpretable information transformers
are more suitable for interpretation (of obtained results) than others.
Namely, on the set $\mathbf{C}(\mathcal{D}, \mathcal{U})$ of information transformers from $\mathcal{D}$ to $\mathcal{U}$ one
defines some preorder relation $\gg$, which specifies the relative quality of various
interpretable information transformers. Typically the relation $\gg$ is
predetermined by the specific formulation of a problem of optimal information
transformer synthesis (that is, of a decision-making problem).

We shall say that an abstract decision-making problem is determined by
a triple $(\mathcal{D}, \mathcal{U}, \gg)$, where $\mathcal{D}$ is an object of studied (input) signals, $\mathcal{U}$ is
an object of decisions (or interpretations), and $\gg$ is a preorder on the set
$\mathbf{C}(\mathcal{D}, \mathcal{U})$. We shall call a decision-making problem monotone if the
preorder $\gg$ is monotone with respect to the accuracy relation,

\[ a \geqslant b \;\Rightarrow\; a \gg b, \]

that is, a more accurate IT provides better quality of interpretation.

For a given information transformer $a\colon \mathcal{D} \to \mathcal{A}$ we shall also say that an
IT $b$ reduces $a$ to an interpretable information transformer if $b \circ a\colon \mathcal{D} \to \mathcal{U}$,
that is, if $b\colon \mathcal{A} \to \mathcal{U}$. Such an information transformer $b$ will be called a
decision strategy.

The set of all interpretable information transformers obtainable on the
basis of $a\colon \mathcal{D} \to \mathcal{A}$ will be denoted $\mathcal{U}_a \subseteq \mathbf{C}(\mathcal{D}, \mathcal{U})$:

\[ \mathcal{U}_a \stackrel{\mathrm{def}}{=} \{\, b \circ a \mid b\colon \mathcal{A} \to \mathcal{U} \,\}. \]

We shall call a decision strategy $r\colon \mathcal{A} \to \mathcal{U}$ optimal (for the IT $a$ with
respect to the problem $(\mathcal{D}, \mathcal{U}, \gg)$) if the IT $r \circ a$ is a maximal element
of $\mathcal{U}_a$ with respect to $\gg$. Thus, a decision-making problem for a given
information transformer $a$ is stated as the problem of constructing optimal
decision strategies.
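
To make the notion of an optimal decision strategy concrete, the following sketch enumerates all deterministic strategies in the category of sets and keeps those whose composition with a is maximal. It is an illustration under stated assumptions, not a construction from the text: the preorder is induced by a hypothetical score (the number of signals whose interpretation agrees with a hypothetical target map), and all names are chosen for this example.

```python
from itertools import product

def strategies(A, U):
    """All deterministic decision strategies r: A -> U, as dictionaries."""
    for images in product(U, repeat=len(A)):
        yield dict(zip(A, images))

def compose(r, a):
    """Interpretable IT r o a: D -> U for maps given as dictionaries."""
    return {x: r[a[x]] for x in a}

def optimal_strategies(a, A, U, at_least_as_good):
    """Strategies r for which r o a is a maximal element of U_a:
    no other element of U_a is strictly better in the given preorder."""
    cands = list(strategies(A, U))

    def maximal(r):
        m = compose(r, a)
        return all(at_least_as_good(m, compose(s, a))
                   or not at_least_as_good(compose(s, a), m)
                   for s in cands)

    return [r for r in cands if maximal(r)]

# Hypothetical data: a coarse measurement of four signals and a target map.
a = {0: "lo", 1: "lo", 2: "hi", 3: "hi"}
target = {0: "no", 1: "no", 2: "no", 3: "yes"}

def score(c):
    """Assumed quality of an interpretable IT: number of correct answers."""
    return sum(c[x] == target[x] for x in c)

def at_least_as_good(c, d):
    return score(c) >= score(d)

best = optimal_strategies(a, ["lo", "hi"], ["no", "yes"], at_least_as_good)
print(best)
```

Because the measurement a cannot distinguish signals 2 and 3, two different strategies are optimal here, which illustrates that maximal elements of a preorder need not be unique.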

8.2 Semantical informativeness 

The relation $\gg$ induces a preorder relation $\sqsupseteq$ on the class of information
transformers acting from $\mathcal{D}$ in the following way.

Assume that $a$ and $b$ are information transformers with the source $\mathcal{D}$,
that is, $a\colon \mathcal{D} \to \mathcal{A}$, $b\colon \mathcal{D} \to \mathcal{B}$. By definition, put

\[ a \sqsupseteq b \;\stackrel{\mathrm{def}}{\Longleftrightarrow}\; \forall b'\colon \mathcal{B} \to \mathcal{U}\ \ \exists a'\colon \mathcal{A} \to \mathcal{U}\ \ \ a' \circ a \gg b' \circ b. \]

In other words, $a \sqsupseteq b$ if for every interpretable information transformer
$d$ derived from $b$ there exists an interpretable information transformer $c$
derived from $a$ such that $c \gg d$, that is,

\[ a \sqsupseteq b \;\Longleftrightarrow\; \forall d \in \mathcal{U}_b\ \ \exists c \in \mathcal{U}_a\ \ \ c \gg d. \]

It can easily be checked that the relation $\sqsupseteq$ is a preorder.

It is natural to expect that if one information transformer is more
informative than another, then the former will be better than the latter in
any context. In other words, for any preorder $\gg$ on the set of interpretable
information transformers the induced preorder $\sqsupseteq$ is dominated by the
informativeness relation $\succcurlyeq$ (that is, $\sqsupseteq$ is weaker than $\succcurlyeq$). The converse is also
true.

Definition 8.1 We shall say that an information transformer $a$ is
semantically more informative than $b$ if for any interpretation object $\mathcal{U}$ and for any
preorder $\gg$ (on the set of interpretable information transformers) $a \sqsupseteq b$ for
the induced preorder $\sqsupseteq$.

The following theorem is in some sense a "completeness" theorem, which
establishes a relation between "structure" ($b$ can be "derived" from $a$) and
"semantics" ($a$ is uniformly better than $b$ in decision-making problems).

Theorem 8.2 For any information transformers $a$ and $b$ with a common
source $\mathcal{D}$, the information transformer $a$ is more informative than $b$ if and only
if $a$ is semantically more informative than $b$.

Let us remark that the proof of this theorem relies heavily on the extreme
breadth of the class of decision problems involved. This makes it possible to
select for any given pair of ITs $a$, $b$ an appropriate decision-making
problem $(\mathcal{D}, \mathcal{U}_b, \gg_b)$ in which the interpretation object $\mathcal{U}_b$ and the preorder $\gg_b$
depend on the IT $b$. However, in some cases it is possible to point out a
concrete (universal) decision-making problem such that

\[ a \succcurlyeq b \;\Longleftrightarrow\; a \sqsupseteq b. \]

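Theorem 8.3 Assume that for a given object $\mathcal{D}$ there exists an object $\mathcal{V}$
such that for every information transformer acting from $\mathcal{D}$ there exists an
equivalent (with respect to informativeness) IT acting from $\mathcal{D}$ to $\mathcal{V}$, that is,

\[ \forall \mathcal{B}\ \ \forall b\colon \mathcal{D} \to \mathcal{B}\ \ \ \exists b'\colon \mathcal{D} \to \mathcal{V}\ \ \ b \sim b'. \]

Let us choose the decision object $\mathcal{U} \stackrel{\mathrm{def}}{=} \mathcal{V}$ and the preorder $\gg$ defined by

\[ c \gg d \;\stackrel{\mathrm{def}}{\Longleftrightarrow}\; c \geqslant d. \]

Then $a \succcurlyeq b$ if and only if $a \sqsupseteq b$.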

Note that in the general case an optimal decision strategy (if it exists) can be
nondeterministic. However, in many cases it is sufficient to search for optimal
strategies among deterministic ITs. Indeed, in some categories of information
transformers the accuracy relation satisfies the following condition:
every IT is dominated by some deterministic IT, that is, for every IT there
exists a more accurate deterministic IT.

Proposition 8.4 Assume that $(\mathcal{D}, \mathcal{U}, \gg)$ is a monotone decision-making
problem in a category of ITs $\mathbf{C}$ with the subcategory of deterministic ITs $\mathbf{D}$.
Assume also that the following condition holds:

\[ \forall c \in \mathrm{Ar}(\mathbf{C})\ \ \exists d \in \mathrm{Ar}(\mathbf{D})\ \ \ d \geqslant c. \]

Then for any IT $a\colon \mathcal{D} \to \mathcal{R}$ and for any decision strategy $r\colon \mathcal{R} \to \mathcal{U}$ there
exists a deterministic strategy $r_0\colon \mathcal{R} \to \mathcal{U}$ such that $r_0 \circ a \gg r \circ a$.

8.3 Decision-making problems with prior information

In this section we formulate, in terms of categories of information
transformers, an analog of the classical problem of constructing an optimal
decision strategy for decision problems with prior information (or information
a priori). We also prove a counterpart of the Bayesian principle from the
theory of statistical games [7,28]. Like its statistical prototype, it reduces
the problem of constructing an optimal decision strategy to a much
simpler problem of finding an optimal decision for posterior information (or
information a posteriori).

First we define, in terms of categories of information transformers, some
necessary concepts, namely, the concepts of distribution, conditional
information transformer, decision problem with prior information, and others.

8.4 Distributions in categories of ITs

We shall say that a distribution on an object $\mathcal{A}$ (in some fixed category of
ITs $\mathbf{C}$) is any IT $f\colon \mathcal{Z} \to \mathcal{A}$, where $\mathcal{Z}$ is the terminal object in $\mathbf{C}$.

The concept of distribution corresponds to the general concept of an
element of an object in a category, namely, a morphism from the terminal
object (see, e.g., [17]).

Any distribution of the form $h\colon \mathcal{Z} \to \mathcal{A} \times \mathcal{B}$ will be called a joint
distribution on $\mathcal{A}$ and $\mathcal{B}$. The projections $\pi_{\mathcal{A},\mathcal{B}}$ and $\nu_{\mathcal{A},\mathcal{B}}$ onto the components
$\mathcal{A}$ and $\mathcal{B}$, respectively, "extract" the marginal distributions $f$ and $g$ of the joint
distribution $h$, that is,

\[ f = \pi_{\mathcal{A},\mathcal{B}} \circ h\colon \mathcal{Z} \to \mathcal{A}, \qquad
   g = \nu_{\mathcal{A},\mathcal{B}} \circ h\colon \mathcal{Z} \to \mathcal{B}. \]

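We say that the components of a joint distribution $h\colon \mathcal{Z} \to \mathcal{A} \times \mathcal{B}$ are
independent whenever this joint distribution is completely determined by
its marginal distributions, that is,

\[ h = (\pi_{\mathcal{A},\mathcal{B}} \circ h) * (\nu_{\mathcal{A},\mathcal{B}} \circ h). \]

Let $f$ be an arbitrary distribution on $\mathcal{A}$ and let $a\colon \mathcal{A} \to \mathcal{B}$ be some
information transformer. Then the distribution $g = a \circ f$ in some sense
"contains information about $f$." This can be expressed precisely
if one considers the joint distribution generated by the distribution $f$ and
the IT $a$:

\[ h\colon \mathcal{Z} \to \mathcal{A} \times \mathcal{B}, \qquad h = (i_{\mathcal{A}} * a) \circ f. \]

Note that the marginal distributions of $h$ coincide with $f$ and $g$,
respectively. Indeed,

\[ \pi_{\mathcal{A},\mathcal{B}} \circ h = \pi_{\mathcal{A},\mathcal{B}} \circ (i_{\mathcal{A}} * a) \circ f = i_{\mathcal{A}} \circ f = f, \]

\[ \nu_{\mathcal{A},\mathcal{B}} \circ h = \nu_{\mathcal{A},\mathcal{B}} \circ (i_{\mathcal{A}} * a) \circ f = a \circ f = g. \]

Let $h$ be a joint distribution on $\mathcal{A} \times \mathcal{B}$. We shall say that $a\colon \mathcal{A} \to \mathcal{B}$
is a conditional IT for $h$ with respect to $\mathcal{A}$ whenever $h$ is generated by the
marginal distribution $\pi_{\mathcal{A},\mathcal{B}} \circ h$ and the IT $a$, that is,

\[ h = (i_{\mathcal{A}} * a) \circ \pi_{\mathcal{A},\mathcal{B}} \circ h. \]

Similarly, an IT $b\colon \mathcal{B} \to \mathcal{A}$ such that

\[ h = (b * i_{\mathcal{B}}) \circ \nu_{\mathcal{A},\mathcal{B}} \circ h \]

will be called a conditional IT for $h$ with respect to $\mathcal{B}$.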

8.5 Bayesian decision-making problems 

Suppose that two objects $\mathcal{D}$ and $\mathcal{U}$ are fixed in some category of ITs,
namely, the object of signals and the object of decisions, respectively. In a
decision-making problem with a prior distribution $f$ on $\mathcal{D}$ one fixes some
preorder $\gg_f$ on the set of joint distributions on $\mathcal{D} \times \mathcal{U}$ whose $\mathcal{D}$-marginal
distribution coincides with $f$.

Informally, any joint distribution $h$ on $\mathcal{D} \times \mathcal{U}$ of this kind can be
considered as a joint distribution of a studied signal (with the distribution
$f = \pi_{\mathcal{D},\mathcal{U}} \circ h$ on $\mathcal{D}$) and a decision (with the distribution $g = \nu_{\mathcal{D},\mathcal{U}} \circ h$ on $\mathcal{U}$).
The preorder $\gg_f$ determines how good the "correlation" between studied
signals and decisions is.

Formally, an abstract decision problem with prior information is
determined by a quadruple $(\mathcal{D}, \mathcal{U}, f, \gg_f)$, where $\mathcal{D}$ is an object of studied
signals, $\mathcal{U}$ is an object of decisions (or interpretations), $f\colon \mathcal{Z} \to \mathcal{D}$ is a prior
distribution (or distribution a priori), and $\gg_f$ is a preorder on the set of
ITs $h\colon \mathcal{Z} \to \mathcal{D} \times \mathcal{U}$ that satisfy the condition $\pi_{\mathcal{D},\mathcal{U}} \circ h = f$.

Furthermore, suppose that there is a fixed IT $a\colon \mathcal{D} \to \mathcal{R}$ (which
determines a measurement; $\mathcal{R}$ can be called the object of observations). An IT
$r\colon \mathcal{R} \to \mathcal{U}$ is called optimal (for the IT $a$ with respect to $\gg_f$) if the
distribution $(i * r \circ a) \circ f$ is a maximal element with respect to $\gg_f$. The set of
all optimal information transformers is denoted $\mathrm{Opt}_f(a \circ f)$.

Theorem 8.5 (Bayesian principle). Let $f$ be a given prior distribution
on $\mathcal{D}$, let $a\colon \mathcal{D} \to \mathcal{R}$ be a fixed IT, and let $b\colon \mathcal{R} \to \mathcal{D}$ be a conditional
information transformer for $(i * a) \circ f$ with respect to $\mathcal{R}$. Then the set of
optimal ITs $r\colon \mathcal{R} \to \mathcal{U}$, namely, the set of optimal decision strategies for $f$
over $a \circ f$, coincides with the set of optimal decision strategies for $b \circ g$ over
$g$, where $g = a \circ f$:

\[ \mathrm{Opt}_f(a \circ f) = \mathrm{Opt}_{b \circ g}(g). \]

In a wide class of decision problems (e.g., in linear estimation problems)
an optimal IT $r$ happens to be deterministic and is specified by the
"deterministic part" of the IT $b$.

For many categories of information transformers (for example, stochastic,
multivalued, and fuzzy ITs [4,5,20,28]) an optimal decision strategy
$r$ can be constructed "pointwise" according to the following scheme. For
a given "result of observation" $y \in \mathcal{R}$ consider the conditional (posterior)
distribution $b(y)$ for $f$ under a fixed $g = y$, and put

\[ r(y) \stackrel{\mathrm{def}}{=} d_{b(y)}, \]

where $d_{b(y)}$ is an optimal decision with respect to the posterior distribution
$b(y)$.
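
The pointwise scheme can be tried out on a small example in the stochastic setting. The sketch below is only an illustration under explicit assumptions: all spaces are finite, the preorder on joint distributions is induced by a hypothetical expected loss (smaller is better), every observation has positive probability, and the function names are invented for this example. It compares the strategy built pointwise from posterior distributions with a brute-force search over all deterministic strategies.

```python
from fractions import Fraction as F
from itertools import product

def posterior(f, a, y):
    """Conditional (posterior) distribution b(y) on the signals, given y.
    Assumes that the observation y has positive probability."""
    joint = {x: f[x] * a[x][y] for x in f}
    total = sum(joint.values())
    return {x: p / total for x, p in joint.items()}

def bayes_decision(post, loss, U):
    """Decision that minimizes the posterior expected loss."""
    return min(U, key=lambda u: sum(post[x] * loss[x][u] for x in post))

def risk(r, f, a, loss):
    """Expected loss of a deterministic strategy r: R -> U."""
    return sum(f[x] * a[x][y] * loss[x][r[y]] for x in f for y in r)

# Hypothetical prior f, measurement a, and 0-1 loss.
f = {"s0": F(3, 4), "s1": F(1, 4)}
a = {"s0": {"y0": F(4, 5), "y1": F(1, 5)},
     "s1": {"y0": F(1, 10), "y1": F(9, 10)}}
loss = {"s0": {"u0": 0, "u1": 1}, "s1": {"u0": 1, "u1": 0}}
R, U = ["y0", "y1"], ["u0", "u1"]

# Pointwise construction: r(y) is a Bayes decision for the posterior b(y).
pointwise = {y: bayes_decision(posterior(f, a, y), loss, U) for y in R}

# Brute-force search over all deterministic strategies.
brute = min((dict(zip(R, us)) for us in product(U, repeat=len(R))),
            key=lambda r: risk(r, f, a, loss))

print(pointwise, brute, risk(pointwise, f, a, loss))
```

Exact rational arithmetic (fractions.Fraction) is used so that the two strategies can be compared without rounding effects.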



8.6 Decision-making problems in concrete categories of ITs

8.6.1 Stochastic ITs 

Let us demonstrate that the basic concepts of mathematical statistics
are adequately described in terms of this IT-category. Namely, we shall
verify that the concepts of distribution, conditional distribution, etc.
(introduced above in terms of IT-categories) lead, in the category of
stochastic ITs, to the corresponding classical concepts.

Indeed, any probability distribution $Q$ on a given measurable space
$\mathcal{A} = (\Omega_{\mathcal{A}}, \mathfrak{S}_{\mathcal{A}})$ is uniquely determined by the morphism $f\colon \mathcal{Z} \to \mathcal{A}$ from the
terminal object $\mathcal{Z} = (\{0\}, \{\emptyset, \{0\}\})$ (a one-point measurable space) such
that

\[ P_f(0, A) = Q(A) \qquad \forall A \in \mathfrak{S}_{\mathcal{A}}. \]

In what follows we shall omit the first argument in $P_f(0, A)$ and write just
$P_f(A)$ instead.

A statistical experiment is described by a family of probability measures
$Q_\theta$ on some measurable space $\mathcal{B}$. This family is usually parametrized by
elements of a certain set $\Omega_{\mathcal{A}}$. Sometimes (especially when statistical problems
with prior information are studied) it is additionally assumed that the
set $\Omega_{\mathcal{A}}$ is equipped with some $\sigma$-algebra $\mathfrak{S}_{\mathcal{A}}$ and that $Q_\theta(B)$ is a measurable
function of $\theta \in \Omega_{\mathcal{A}}$ for all $B \in \mathfrak{S}_{\mathcal{B}}$ (and thus $Q_\theta(B)$ is a transition
probability function [8]). Therefore, such a statistical experiment is determined by
the stochastic information transformer $a\colon \mathcal{A} \to \mathcal{B}$, where

\[ P_a(\theta, B) = Q_\theta(B) \qquad \forall \theta \in \Omega_{\mathcal{A}},\ \ \forall B \in \mathfrak{S}_{\mathcal{B}}. \]

In the case when no $\sigma$-algebra on the set $\Omega_{\mathcal{A}}$ is specified, one can put
$\mathfrak{S}_{\mathcal{A}} = \mathcal{P}(\Omega_{\mathcal{A}})$, that is, the $\sigma$-algebra of all subsets of the set $\Omega_{\mathcal{A}}$. It
is clear that in this case the function $P_a(\theta, B) = Q_\theta(B)$ is a measurable
function of $\theta \in \Omega_{\mathcal{A}}$ for every fixed $B \in \mathfrak{S}_{\mathcal{B}}$ and thus (being a transition
probability function) is described by a stochastic IT $a\colon \mathcal{A} \to \mathcal{B}$.

Note also that any statistic, being a measurable function, is represented
by a certain deterministic IT. Decision strategies also correspond to
deterministic ITs. At the same time, nondeterministic (mixed) decision
strategies are adequately represented by stochastic information transformers of
general form.

Now, let $f$ be some fixed distribution on $\mathcal{A}$ and let $a\colon \mathcal{A} \to \mathcal{B}$ be some
IT. The joint distribution $h$ on $\mathcal{A} \times \mathcal{B}$ generated by $f$ and $a$ (from the
IT-categorical point of view, see Section 8.4) is

\[ h = (i * a) \circ f. \]

It means that for every set $A \times B$, where $A \in \mathfrak{S}_{\mathcal{A}}$ and $B \in \mathfrak{S}_{\mathcal{B}}$,

\[ P_h(A \times B) = \int_A P_a(\omega, B)\, P_f(d\omega). \]

Thus we come to the well-known classical expression for the generated joint
distribution (see, for example, [8]).

Now assume that $P_f$ is considered as a prior probability distribution
(or distribution a priori) on $\mathcal{A}$. Then for a given transition probability
function $P_a$, a posterior (or conditional) distribution $P_b(\omega', \cdot)$ on $\mathcal{A}$ for a
fixed $\omega' \in \Omega_{\mathcal{B}}$ is determined, according to [8], by a transition probability
function $P_b(\omega', A)$, $\omega' \in \Omega_{\mathcal{B}}$, $A \in \mathfrak{S}_{\mathcal{A}}$, such that

\[ P_h(A \times B) = \int_B P_b(\omega', A)\, P_g(d\omega') \qquad \forall A \in \mathfrak{S}_{\mathcal{A}},\ \ \forall B \in \mathfrak{S}_{\mathcal{B}}, \]

where

\[ P_g(B) = \int_{\Omega_{\mathcal{A}}} P_a(\omega, B)\, P_f(d\omega) \qquad \forall B \in \mathfrak{S}_{\mathcal{B}}. \]

It is easily verified that in terms of ITs the above expressions have the
following forms:

\[ h = (b * i) \circ g, \qquad \text{where} \quad g = a \circ f. \]

This shows that the classical concept of conditional distribution is
adequately described by the concept of conditional IT in terms of categories of
information transformers.
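
For finite spaces the integrals above reduce to sums, and the identity between the generated joint distribution and the one rebuilt from a conditional IT can be checked directly. The following sketch assumes finite stochastic ITs stored as nested dictionaries of exact fractions; the function names and the data are hypothetical, and the conditional IT is chosen arbitrarily (uniform) where the marginal vanishes, since any choice is a valid variant there.

```python
from fractions import Fraction as F

def joint(f, a):
    """h = (i * a) o f, that is, h(x, y) = f(x) * P_a(x, {y})."""
    return {(x, y): f[x] * a[x][y] for x in f for y in a[x]}

def marginal_B(h):
    """g = a o f, obtained by summing the joint distribution over x."""
    g = {}
    for (x, y), p in h.items():
        g[y] = g.get(y, 0) + p
    return g

def conditional_wrt_B(h, g, A):
    """A variant of the conditional IT b: B -> A.
    Where g(y) = 0 the value is arbitrary; a uniform distribution is used."""
    uniform = F(1, len(A))
    return {y: {x: (h[(x, y)] / g[y] if g[y] else uniform) for x in A}
            for y in g}

# Hypothetical prior f and transition probabilities a.
f = {"x0": F(1, 2), "x1": F(1, 2)}
a = {"x0": {"y0": F(1), "y1": F(0), "y2": F(0)},
     "x1": {"y0": F(1, 3), "y1": F(2, 3), "y2": F(0)}}

h = joint(f, a)
g = marginal_B(h)
b = conditional_wrt_B(h, g, list(f))

# h = (b * i) o g, that is, h(x, y) = P_b(y, {x}) * g(y).
rebuilt = {(x, y): b[y][x] * g[y] for y in g for x in f}
print(rebuilt == h)
```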

8.6.2 Linear stochastic ITs 

Note that in the category of linear information transformers every IT is
dominated (in the sense of the accuracy relation $\geqslant$) by a deterministic IT.
Hence, according to Proposition 8.4, in any monotone decision-making
problem one can search for optimal decision strategies in the
class of deterministic ITs without loss of quality.

According to Section 8.4, any joint distribution on $\mathcal{A} \times \mathcal{B}$ is an IT
$h\colon \mathcal{Z} \to \mathcal{A} \times \mathcal{B}$, where $\mathcal{Z}$ is the terminal object in the category of linear ITs, i.e.,
$\mathcal{Z} = \{0\}$ is a 0-dimensional linear space. Thus, $h = \langle 0, \Sigma_h \rangle$, where $\Sigma_h$ is
a self-adjoint nonnegative operator in $\mathcal{A} \times \mathcal{B}$. The operator $\Sigma_h$ can be
represented in the following "matrix" form:

\[ \Sigma_h = \begin{pmatrix} \Sigma_f & \Sigma_{f,g} \\ \Sigma_{g,f} & \Sigma_g \end{pmatrix}, \]

where $\Sigma_{f,g} = \Sigma_{g,f}^{*}$.

It is shown in [19] that in the category of linear ITs conditional
distributions exist for any joint distribution.

Theorem 8.6 For any joint distribution $h\colon \mathcal{Z} \to \mathcal{A} \times \mathcal{B}$ there exist
conditional information transformers $a\colon \mathcal{A} \to \mathcal{B}$ and $b\colon \mathcal{B} \to \mathcal{A}$.

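Variants of conditional information transformers are given by the formulas

\[ a = \bigl\langle \Sigma_{g,f} \Sigma_f^{-},\ \ \Sigma_g - \Sigma_{g,f} \Sigma_f^{-} \Sigma_{f,g} \bigr\rangle, \qquad
   b = \bigl\langle \Sigma_{f,g} \Sigma_g^{-},\ \ \Sigma_f - \Sigma_{f,g} \Sigma_g^{-} \Sigma_{g,f} \bigr\rangle. \qquad (14) \]

Here $A^{-}$ denotes the pseudoinverse operator for $A$ [34].

If $\Sigma_f > 0$ or $\Sigma_g > 0$ (in this case the operator is nonsingular and its
pseudoinverse coincides with its inverse, $\Sigma^{-} = \Sigma^{-1}$), then the corresponding
conditional information transformer $a$ or $b$ is unique.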

Thus, in problems with prior information one can apply the Bayesian
principle. Its direct proof in the category of linear ITs, as well as the explicit
expressions for conditional information transformers, can be found in [19].

8.6.3 Multivalued ITs 

In the category of multivalued information transformers every IT is
dominated (in the sense of the partial order $\geqslant$) by a deterministic IT. Thus,
in a monotone decision-making problem one can search for optimal decision
strategies in the class of deterministic ones.

It is clear that any joint distribution on $\mathcal{A} \times \mathcal{B}$ (i.e., an IT $h\colon \mathcal{Z} \to \mathcal{A} \times \mathcal{B}$)
is specified by a subset $H$ of $\mathcal{A} \times \mathcal{B}$, since the terminal object $\mathcal{Z}$ in the
category of multivalued ITs is a one-element set. It is easy to see that
for every joint distribution in the category of multivalued ITs there exist
conditional distributions [20].

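Theorem 8.7 For any joint distribution $H$ on $\mathcal{A} \times \mathcal{B}$, conditional information
transformers $a\colon \mathcal{A} \to \mathcal{B}$ and $b\colon \mathcal{B} \to \mathcal{A}$ always exist. Some variants of
conditional ITs are determined by the following expressions:

\[ ax = \begin{cases} \{\, y \in \mathcal{B} \mid (x, y) \in H \,\}, & \text{if } x \in p_{\mathcal{A}} H, \\ \mathcal{B}, & \text{if } x \notin p_{\mathcal{A}} H, \end{cases}
\qquad
   by = \begin{cases} \{\, x \in \mathcal{A} \mid (x, y) \in H \,\}, & \text{if } y \in p_{\mathcal{B}} H, \\ \mathcal{A}, & \text{if } y \notin p_{\mathcal{B}} H. \end{cases} \]

Here $p_{\mathcal{A}} H$ and $p_{\mathcal{B}} H$ denote the projections of $H$ on $\mathcal{A}$ and $\mathcal{B}$, respectively, e.g.,
$p_{\mathcal{A}} H = \{\, x \in \mathcal{A} \mid \exists y \in \mathcal{B}\ \ (x, y) \in H \,\}$.

Therefore, in decision problems with prior information, the Bayesian
approach can be effectively applied.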
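
The expressions of Theorem 8.7 are easy to evaluate on finite sets. The sketch below is an illustration only: it assumes that a multivalued IT is stored as a dictionary from points to sets, and that the joint distribution generated by a marginal set and a multivalued IT consists of all pairs (u, v) with u in the marginal and v in the image of u. The function names are hypothetical.

```python
def projections(H):
    """Projections p_A H and p_B H of a subset H of A x B."""
    return {x for x, _ in H}, {y for _, y in H}

def conditional_ITs(H, A, B):
    """Variants of the conditional ITs a: A -> B and b: B -> A of Theorem 8.7."""
    pA, pB = projections(H)
    a = {x: ({y for y in B if (x, y) in H} if x in pA else set(B)) for x in A}
    b = {y: ({x for x in A if (x, y) in H} if y in pB else set(A)) for y in B}
    return a, b

def generated(marginal, cond, swap=False):
    """Joint distribution generated by a marginal set and a multivalued IT.
    With swap=True the pairs are reordered to (second, first)."""
    pairs = {(u, v) for u in marginal for v in cond[u]}
    return {(v, u) for u, v in pairs} if swap else pairs

# Hypothetical joint distribution on A x B.
A, B = {1, 2, 3}, {"p", "q"}
H = {(1, "p"), (1, "q"), (2, "q")}

a, b = conditional_ITs(H, A, B)
pA, pB = projections(H)

# Both conditional ITs regenerate H from the corresponding marginal.
print(generated(pA, a) == H, generated(pB, b, swap=True) == H)
```

Note that the value of a at the point 3, which lies outside the projection of H, does not affect the regenerated joint distribution; this is why different variants of the conditional IT are possible.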

8.6.4 Categories of fuzzy information transformers 

Here we define two categories of fuzzy information transformers, FMT and
FPT, that correspond to different fuzzy theories [4].

Objects of these categories are arbitrary sets, and morphisms are everywhere
defined fuzzy maps, namely, maps that take an element to a normed
fuzzy set (a fuzzy set $A$ is normed if the supremum of its membership function
$\mu_A$ is 1). Thus, an information transformer $a\colon \mathcal{A} \to \mathcal{B}$ is defined by a
membership function $\mu_{ax}(y)$, which is interpreted as the grade of membership of
an element $y \in \mathcal{B}$ in the fuzzy set $ax$ for every element $x \in \mathcal{A}$.

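The category FMT. Suppose $a\colon \mathcal{A} \to \mathcal{B}$ and $b\colon \mathcal{B} \to \mathcal{C}$ are some fuzzy
maps. We define their composition $b \circ a$ as follows: for every element $x \in \mathcal{A}$
put

\[ \mu_{(b \circ a)x}(z) \stackrel{\mathrm{def}}{=} \sup_{y \in \mathcal{B}} \min\bigl(\mu_{ax}(y),\ \mu_{by}(z)\bigr). \]

For a pair of fuzzy information transformers $a\colon \mathcal{D} \to \mathcal{A}$ and $b\colon \mathcal{D} \to \mathcal{B}$
with the common source $\mathcal{D}$, we define their product as the IT that acts from
$\mathcal{D}$ to the Cartesian product $\mathcal{A} \times \mathcal{B}$ such that

\[ \mu_{(a * b)x}(y, z) \stackrel{\mathrm{def}}{=} \min\bigl(\mu_{ax}(y),\ \mu_{bx}(z)\bigr). \]

The category FPT. Define the composition and the product by the
following expressions:

\[ \mu_{(b \circ a)x}(z) \stackrel{\mathrm{def}}{=} \sup_{y \in \mathcal{B}} \bigl(\mu_{ax}(y)\, \mu_{by}(z)\bigr), \qquad
   \mu_{(a * b)x}(y, z) \stackrel{\mathrm{def}}{=} \mu_{ax}(y)\, \mu_{bx}(z). \]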
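
The difference between the two categories can be seen on a small numerical example. The sketch below is illustrative and rests on assumptions: fuzzy ITs over finite sets are stored as nested dictionaries of membership grades, the supremum becomes a maximum, the combining operation is min for FMT and ordinary multiplication for FPT, and the product of two ITs combines the grades of the components with the same operation. The function names are hypothetical.

```python
from operator import mul

def compose(b, a, combine):
    """Fuzzy composition b o a over finite sets:
    mu_{(b o a)x}(z) = max_y combine(mu_{ax}(y), mu_{by}(z))."""
    out = {}
    for x, ax in a.items():
        zs = {z for y in ax for z in b[y]}
        out[x] = {z: max(combine(ax[y], b[y].get(z, 0.0)) for y in ax)
                  for z in zs}
    return out

def star(a, b, combine):
    """Product a * b of two fuzzy ITs with a common source (assumed form)."""
    return {x: {(y, z): combine(a[x][y], b[x][z])
                for y in a[x] for z in b[x]}
            for x in a}

# Hypothetical normed fuzzy maps (every row has maximum grade 1).
a = {"x": {"y1": 1.0, "y2": 0.75}}
b = {"y1": {"z1": 0.25, "z2": 1.0},
     "y2": {"z1": 0.5, "z2": 1.0}}
c = {"x": {"u": 1.0, "v": 0.5}}

fmt = compose(b, a, min)   # sup-min composition (category FMT)
fpt = compose(b, a, mul)   # sup-product composition (category FPT)
print(fmt, fpt)
print(star(a, c, min), star(a, c, mul))
```

The grades were chosen as exactly representable binary fractions so that the results can be compared without rounding issues.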

In both categories of fuzzy information transformers defined above
the subcategory of deterministic ITs is (isomorphic to) the category of sets
Set. Let $g\colon \mathcal{A} \to \mathcal{B}$ be some map (a morphism in Set). Define the
corresponding fuzzy IT (namely, a fuzzy map, which is obviously everywhere
defined) $g\colon \mathcal{A} \to \mathcal{B}$ in the following way:

\[ \mu_{gx}(y) \stackrel{\mathrm{def}}{=} \begin{cases} 1, & \text{if } g(x) = y, \\ 0, & \text{if } g(x) \neq y. \end{cases} \]



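Concerning the choice of the accuracy relation, note that in these IT-categories,
as in the category of multivalued ITs, apart from the trivial accuracy
relation one can put, for $a, b\colon \mathcal{A} \to \mathcal{B}$,

\[ a \geqslant b \;\stackrel{\mathrm{def}}{\Longleftrightarrow}\; \forall x \in \mathcal{A}\ \ \forall y \in \mathcal{B}\ \ \ \mu_{ax}(y) \leqslant \mu_{bx}(y). \]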

In each fuzzy IT-category these two choices lead to two different
informativeness relations, namely, the strong and the weak ones.

As in the categories of linear and multivalued ITs discussed above,
monotone decision-making problems admit restriction of the class of optimal
decision strategies to deterministic ITs without loss of quality.

It is clear that any joint distribution on $\mathcal{A} \times \mathcal{B}$ (i.e., an IT $h\colon \mathcal{Z} \to \mathcal{A} \times \mathcal{B}$)
is, in fact, a normed fuzzy subset of $\mathcal{A} \times \mathcal{B}$, since the terminal object $\mathcal{Z}$ in the
categories of fuzzy ITs is a one-element set. Denote this fuzzy set by $H$. It
is shown in [4, 5] that for every joint distribution in the categories of fuzzy
ITs there exist conditional distributions.

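Theorem 8.8 For any joint distribution $H$ on $\mathcal{A} \times \mathcal{B}$, conditional information
transformers $a\colon \mathcal{A} \to \mathcal{B}$ and $b\colon \mathcal{B} \to \mathcal{A}$ always exist. Some variants of
conditional ITs are determined by the following expressions:

(a) in the category FMT

\[ \mu_{ax}(y) = \mu_{by}(x) = \mu_H(x, y), \]

(b) in the category FPT

\[ \mu_{ax}(y) = \begin{cases} \dfrac{\mu_H(x, y)}{\sup_z \mu_H(x, z)}, & \text{if } \sup_z \mu_H(x, z) \neq 0, \\[1ex] 1, & \text{if } \sup_z \mu_H(x, z) = 0, \end{cases} \]

\[ \mu_{by}(x) = \begin{cases} \dfrac{\mu_H(x, y)}{\sup_z \mu_H(z, y)}, & \text{if } \sup_z \mu_H(z, y) \neq 0, \\[1ex] 1, & \text{if } \sup_z \mu_H(z, y) = 0. \end{cases} \]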
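
The FPT variant of the conditional IT (the joint grade divided by the supremum of the joint grades over the other component, and 1 where that supremum vanishes) can be checked numerically. The sketch below is an illustration under assumptions: the sets are finite, the marginal of a fuzzy joint distribution is taken to be the supremum over the other component, and the joint distribution generated in FPT by a marginal and a conditional IT is their pointwise product. Names and data are hypothetical.

```python
from fractions import Fraction as F

def marginal_A(H, A, B):
    """Assumed marginal on A: mu_f(x) = max_y mu_H(x, y)."""
    return {x: max(H[(x, y)] for y in B) for x in A}

def conditional_fpt(H, A, B):
    """FPT variant: mu_{ax}(y) = mu_H(x, y) / sup_z mu_H(x, z),
    and 1 for every y when the supremum is 0."""
    f = marginal_A(H, A, B)
    return {x: {y: (H[(x, y)] / f[x] if f[x] != 0 else F(1)) for y in B}
            for x in A}

# Hypothetical normed fuzzy subset H of A x B (largest grade is 1).
A, B = ["x1", "x2", "x3"], ["y1", "y2"]
H = {("x1", "y1"): F(1), ("x1", "y2"): F(1, 2),
     ("x2", "y1"): F(1, 4), ("x2", "y2"): F(1, 2),
     ("x3", "y1"): F(0), ("x3", "y2"): F(0)}

f = marginal_A(H, A, B)
a = conditional_fpt(H, A, B)

# In FPT the joint distribution generated by f and a is the product of grades.
rebuilt = {(x, y): f[x] * a[x][y] for x in A for y in B}
print(rebuilt == H)
```

Each row of the resulting conditional IT has largest grade 1, so it is an everywhere defined fuzzy map in the sense used above.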

This allows the Bayesian approach and the Bayesian principle to be used in
decision problems with prior information for fuzzy ITs [4,5,22,41], where
connections between fuzzy decision problems and the underlying fuzzy logic
are studied.

The present paper, while very much a first step, lays the basis for a
number of further applications. In paper [41] we propose some realizations of
the above categories, which we believe can be the basis for some interesting new
directions in quantum computation and bioinformatics.